The CfgTraits abstraction simplfies writing algorithms that are
generic over the type of CFG, and enables writing such algorithms
as regular non-template code that operates on opaque references
to CFG blocks and values.
Implementations of CfgTraits provide operations on the concrete
CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`.
CfgInterface is an abstract base class which provides operations
on opaque types CfgBlockRef and CfgValueRef. Those opaque types
encapsulate a `void *`, but the meaning depends on the concrete
CFG type. For example, MachineCfgTraits -- for use with MachineIR
in SSA form -- encodes a Register inside CfgValueRef. Converting
between concrete references and opaque/generic ones is done by
CfgTraits::{fromGeneric,toGeneric}. Convenience methods
CfgTraits::{un}wrap{Iterator,Range} are available as well.
Writing algorithms in terms of CfgInterface adds some overhead
(virtual method calls, plus in same cases it removes the
opportunity to inline iterators), but can be much more convenient
since generic algorithms can be written as non-templates.
This patch adds implementations of CfgTraits for all CFGs on
which dominator trees are calculated, so that the dominator
tree can be ported to this machinery. Only IrCfgTraits (LLVM IR)
and MachineCfgTraits (Machine IR in SSA form) are complete, the
other implementations are limited to the absolute minimum
required to make the upcoming dominator tree changes work.
v5:
- fix MachineCfgTraits::blockdef_iterator and allow it to iterate over
the instructions in a bundle
- use MachineBasicBlock::printName
v6:
- implement predecessors/successors for all CfgTraits implementations
- fix error in unwrapRange
- rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
that is consistent with {wrap,unwrap}{Iterator,Range}
- use getVRegDef instead of getUniqueVRegDef
v7:
- std::forward fix in wrapping_iterator
- fix typos
v8:
- cleanup operators on CfgOpaqueType
- address other review comments
Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d
Differential Revision: https://reviews.llvm.org/D83088
D70365 allows us to make attributes default. This is a follow up to
actually make nosync, nofree and willreturn default. The approach we
chose, for now, is to opt-in to default attributes to avoid introducing
problems to target specific intrinsics. Intrinsics with default
attributes can be created using `DefaultAttrsIntrinsic` class.
This allows building the clang-format unit tests in only 657 ninja steps
rather than 1257 which allows for much faster incremental builds after a
git pull.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D89709
This allows removing the clangAST dependency from libclangToolingCore and
therefore allows clang-format to be built without depending on clangAST.
Before 1166 files had to be compiled for clang-format, now only 796.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D89708
This fixes miscomputation of __builtin_constant_evaluated in the
initializer of a variable that's not usable in constant expressions, but
is readable when constant-folding.
If evaluation of a constant initializer fails, we throw away the
evaluated result instead of keeping it as a non-constant-initializer
value for the variable, because it might not be a correct value.
To avoid regressions for initializers that are foldable but not formally
constant initializers, we now try constant-evaluating some globals in
C++ twice: once to check for a constant initializer (in an mode where
is_constannt_evaluated returns true) and again to determine the runtime
value if the initializer is not a constant initializer.
* Make cc1 and cc1as --compress-debug-sections an alias for --compress-debug-sections=zlib
* Make -gz an alias for -gz=zlib
The new behavior is consistent with GCC when binutils>=2.26 is detected:
-gz is translated to --compress-debug-sections=zlib instead of --compress-debug-sections.
The name is unfortunate because it is similar to the driver option -ftest-coverage.
It turns out aside from one occurrence in a test, this option is not used.
Instead of framing the interface around whether the variable is an ICE
(which is only interesting in C++98), primarily track whether the
initializer is a constant initializer (which is interesting in all C++
language modes).
No functionality change intended.
for which it matters.
This is a step towards separating checking for a constant initializer
(in which std::is_constant_evaluated returns true) and any other
evaluation of a variable initializer (in which it returns false).
Recently commit D78699 (commit 26cfb6e562), fixed clang's behavior with respect
to passing a union type through a register to correctly follow the ABI. However,
this is an ABI breaking change with earlier versions of the clang compiler, so we
should add an -fclang-abi-compat option to address this. Additionally, the PS4 ABI
requires the older behavior, so that is added as well.
This change adds a Ver11 value to the ClangABI enum that when it is set (or the
target is the PS4 triple), we skip the ABI fix introduced in D78699.
Differential Revision: https://reviews.llvm.org/D89747
Update clang/lib/Lex to stop relying on a `MemoryBuffer*`, using the
`MemoryBufferRef` from `getBufferOrNone` since both locations had logic
for checking validity of the buffer. There's potentially a functionality
change, since the logic was wrong (it checked for `nullptr`, which was
never returned by the old API), but if that was reachable the new
behaviour should be better.
Differential Revision: https://reviews.llvm.org/D89402
Update `Lexer` / `Lexer::Lexer` to use `MemoryBufferRef` instead of
`MemoryBuffer*`. Callers that were acquiring a `MemoryBuffer*` via
`SourceManager::getBuffer` were updated, such that if they checked
`Invalid` they use `getBufferOrNone` and otherwise `getBufferOrFake`.
Differential Revision: https://reviews.llvm.org/D89398
Measure amount of high-level or fixed-cost operations performed during
building/loading modules and during header search. High-level operations
like building a module or processing a .pcm file are motivated by
previous issues where clang was re-building modules or re-reading .pcm
files unnecessarily. Fixed-cost operations like `stat` calls are tracked
because clang cannot change how long each operation takes but it can
perform fewer of such operations to improve the compile time.
Also tracking such stats over time can help us detect compile-time
regressions. Added stats are more stable than the actual measured
compilation time, so expect the detected regressions to be less noisy.
On relanding drop stats in MemoryBuffer.cpp as their value is pretty low
but affects a lot of clients and many of those aren't interested in
modules and header search.
rdar://problem/55715134
Reviewed By: aprantl, bruno
Differential Revision: https://reviews.llvm.org/D86895
PartialDiagnostic misses some functions compared to DiagnosticBuilder.
This patch refactors DiagnosticBuilder and PartialDiagnostic, extracts
the common functionality so that the streaming << operators are
shared.
Differential Revision: https://reviews.llvm.org/D84362
Update clang/lib/Format and clang/lib/Rewrite to use a `MemoryBufferRef`
from `getBufferOrFake` instead of `MemoryBuffer*` from `getBuffer`.
No functionality change here, since the call sites weren't checking if
the buffer was valid.
Differential Revision: https://reviews.llvm.org/D89406
The changes made in D88594 caused the test OpenMP/driver.c to fail on a 32-bit host becuase it was offloading to a 64-bit architecture by default. The offloading test was moved to a new file and a feature was added to the lit config to check for a 64-bit host.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D89696
SourceLocation implements `operator<`, so `SourceLocation`-s can be used
as keys in `std::map` directly, there is no need to extract the internal
representation.
Since the `operator<` simply compares the internal representations of
its operands, this patch does not introduce any functional changes.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D89705
- Extend hip-toolchin-features.hip to also check the lld attributes
are passed correctly.
- Add check for cumode attributes.
Differential Revision: https://reviews.llvm.org/D89636
This broke Chromium's PGO build, it seems because hot-cold-splitting got turned
on unintentionally. See comment on the code review for repro etc.
> This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
> the splitting pass to be toggled on/off. The current method of passing
> `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
> correctly (say, with `-O0` or `-Oz`).
>
> To implement the -fsplit-cold-code option, an attribute is applied to
> functions to indicate that they may be considered for splitting. This
> removes some complexity from the old/new PM pipeline builders, and
> behaves as expected when LTO is enabled.
>
> Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
> Differential Revision: https://reviews.llvm.org/D57265
> Reviewed By: Aditya Kumar, Vedant Kumar
> Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
This reverts commit 273c299d5d.
Some projects (e.g. FreeBSD) align pointers to the right but expect a
space between the '*' and any pointer qualifiers such as const. To handle
these cases this patch adds a new config option SpaceAroundPointerQualifiers
that can be used to configure whether spaces need to be added before/after
pointer qualifiers.
PointerAlignment = Right
SpaceAroundPointerQualifiers = Default/After:
void *const *x = NULL;
SpaceAroundPointerQualifiers = Before/Both
void * const *x = NULL;
PointerAlignment = Left
SpaceAroundPointerQualifiers = Default/Before:
void* const* x = NULL;
SpaceAroundPointerQualifiers = After/Both
void* const * x = NULL;
PointerAlignment = Middle
SpaceAroundPointerQualifiers = Default/Before/After/Both:
void * const * x = NULL;
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D88227
Implementing the likelihood attributes for the iteration statements adds
a new helper function. This function can't be const qualified since
these non-modifying members aren't const qualified.
The semantics associated with `__vector [un]signed long` are neither
consistently specified nor consistently implemented.
The IBM XL compilers on AIX traditionally treated these as deprecated
aliases for the corresponding `__vector int` type in both 32-bit and
64-bit modes. The newer, Clang-based, IBM XL compilers on AIX make usage
of the previously deprecated types an error. This is also consistent
with IBM XL C/C++ for Linux on Power (on little endian distributions).
In line with the above, this patch upgrades (on AIX) the deprecation of
`__vector long` to become removal.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D89443
This implements the likelihood attribute for the switch statement. Based on the
discussion in D85091 and D86559 it only handles the attribute when placed on
the case labels or the default labels.
It also marks the likelihood attribute as feature complete. There are more QoI
patches in the pipeline.
Differential Revision: https://reviews.llvm.org/D89210
initialization a little smarter.
Look through casts that preserve zero-ness when determining if an
initializer is zero, so that we can handle cases like an {0} initializer
whose corresponding field is a type other than 'int'.
Old GCC used to aggressively fold VLAs to constant-bound arrays at block
scope in GNU mode. That's non-conforming, and more modern versions of
GCC only do this at file scope. Update Clang to do the same.
Also promote the warning for this from off-by-default to on-by-default
in all cases; more recent versions of GCC likewise warn on this by
default.
This is still slightly more permissive than GCC, as pointed out in
PR44406, as we still fold VLAs to constant arrays in structs, but that
seems justifiable given that we don't support VLA-in-struct (and don't
intend to ever support it), but GCC does.
Differential Revision: https://reviews.llvm.org/D89523
ClangFormat does not correctly handle an Objective-C interface declaration
with both lightweight generics and a protocol conformance.
This simple example:
```
@interface Foo : Bar <Baz> <Blech>
@end
```
means `Foo` extends `Bar` (a lightweight generic class whose type
parameter is `Baz`) and also conforms to the protocol `Blech`.
ClangFormat should not apply any changes to the above example, but
instead it currently formats it quite poorly:
```
@interface Foo : Bar <Baz>
<Blech>
@end
```
The bug is that `UnwrappedLineParser` assumes an open-angle bracket
after a base class name is a protocol list, but it can also be a
lightweight generic specification.
This diff fixes the bug by factoring out the logic to parse
lightweight generics so it can apply both to the declared class
as well as the base class.
Test Plan: New tests added. Ran tests with:
% ninja FormatTests && ./tools/clang/unittests/Format/FormatTests
Confirmed tests failed before diff and passed after diff.
Reviewed By: sammccall, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D89496
This addresses a regression where pretty much all C++ compilations using
-frounding-math now fail, due to rounding being performed in constexpr
function definitions in the standard library.
This follows the "manifestly constant evaluated" approach described in
https://reviews.llvm.org/D87528#2270676 -- evaluations that are required
to succeed at compile time are permitted even in regions with dynamic
rounding modes, as are (unfortunately) the evaluation of the
initializers of local variables of const integral types.
Differential Revision: https://reviews.llvm.org/D89360
`SourceManager::createFileID` asserts that the given `FileEntry` is not
null, so remove the logic that passed in `nullptr`. Since we just added
the file to an in-memory FS via an API that cannot fail, use
`llvm_unreachable` on the error path. Didn't use an `assert` since it
seems cleaner semantically to check the error (and better,
hypothetically, for updating the API to use `Expected` instead of
`ErrorOr`).
I noticed this incidentally while auditing calls to `createFileID`.
This patch makes sure that the instance of TypeSize comparison operator
is done with a fixed type size.
Differential Revision: https://reviews.llvm.org/D89312
After investigation by @asbirlea, the issue that caused the
revert appears to be an issue in the original source, rather
than a problem with the compiler.
This patch enables MemorySSA DSE again.
This reverts commit 915310bf14.
This test was failing in our CI environment, because Jenkins mounts the workspaces into Docker containers using their full path, i.e. /home/jenkins/workspaces/llvm-build.
We've seen permission denied errors because /home/jenkins is mounted with root permissions and the default cache directory under Linux is $HOME/.cache.
The fix is to explicitly provide the -fmodules-cache-path, which the other tests already seem to provide.
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D89453
- The goal of this patch is improve option compatible with RISCV-V GCC,
-mcpu support on GCC side will sent patch in next few days.
- -mtune only affect the pipeline model and non-arch/extension related
target feature, e.g. instruction fusion; in td file it called
TuneFeatures, which is introduced by X86 back-end[1].
- -mtune accept all valid option for -mcpu and extra alias processor
option, e.g. `generic`, `rocket` and `sifive-7-series`, the purpose is
option compatible with RISCV-V GCC.
- Processor alias for -mtune will resolve according the current target arch,
rv32 or rv64, e.g. `rocket` will resolve to `rocket-rv32` or `rocket-rv64`.
- Interaction between -mcpu and -mtune:
* -mtune has higher priority than -mcpu for pipeline model and
TuneFeatures.
[1] https://reviews.llvm.org/D85165
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D89025
folding to not constant folding.
Constant folding of ICEs is done as a GCC compatibility measure, but new
code was picking it up, presumably by accident, due to the bad default.
While here, also switch the flag from a bool to an enum to make it more
obvious what it means at call sites. This highlighted a couple of places
where our behavior is different between C++11 and C++14 due to switching
from checking for an ICE to checking for a converted constant
expression (where there is no 'fold' codepath).
This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
the splitting pass to be toggled on/off. The current method of passing
`-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
correctly (say, with `-O0` or `-Oz`).
To implement the -fsplit-cold-code option, an attribute is applied to
functions to indicate that they may be considered for splitting. This
removes some complexity from the old/new PM pipeline builders, and
behaves as expected when LTO is enabled.
Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
Differential Revision: https://reviews.llvm.org/D57265
Reviewed By: Aditya Kumar, Vedant Kumar
Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with
specialized emitting (e.g. memcpy, various math functions) still do not work.
This patch makes these functions work for `asm()` and `#pragma redefine_extname`.
glibc uses `asm()` to redirect internal libc function calls to hidden aliases.
Limitation: such a function is a builtin in clang, but will not be recognized as
a libcall in optimization passes because Clang does not annotate the renamed
function as a libcall. In GCC -O1 or above, `abs` can be optimized out but we can't.
Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example:
double sin(double x) asm("real_sin");
double f(double d) { return __builtin_sin(d); }
---
According to @rsmith, the following three statements cannot be simultaneously true:
(1) The frontend function foo has known, builtin semantics X.
(2) The symbol foo has known, builtin semantics X.
(3) It's not correct to lower a call to the frontend function foo to the symbol foo.
People do want (1) (if it is profitable to expand a memcpy, do it).
This also means that people do not want to add -fno-builtin-memcpy.
People do want (3): that is why they use asm("__GI_memcpy") in the first place.
So unfortunately we make a compromise by not refuting (2) (see the limitation above).
For most libcalls, there is a small loss because compilers don't synthesize them.
For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make
the assembly level redirection.
(Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable).
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D88712
This reverts commits 683b308c07 and
8487bfd4e9.
We will go for a more restricted approach that does not give freedom to
everyone to change ABIs on whichever platform.
See the discussion on https://reviews.llvm.org/D85802.
Currently, `after` fails when applied to locations in macro arguments. This
change projects the subrange into a file source range and then applies `after`.
Differential Revision: https://reviews.llvm.org/D89468
Prototype the newly proposed load_lane instructions, as specified in
https://github.com/WebAssembly/simd/pull/350. Since these instructions are not
available to origin trial users on Chrome stable, make them opt-in by only
selecting them from intrinsics rather than normal ISel patterns. Since we only
need rough prototypes to measure performance right now, this commit does not
implement all the load and store patterns that would be necessary to make full
use of the offset immediate. However, the full suite of offset tests is included
to make it easy to track improvements in the future.
Since these are the first instructions to have a memarg immediate as well as an
additional immediate, the disassembler needed some additional hacks to be able
to parse them correctly. Making that code more principled is left as future
work.
Differential Revision: https://reviews.llvm.org/D89366
Using TypeSize::getFixedSize() instead of relying upon the implicit
TypeSize->uint64_cast as the type is always fixed width.
Differential Revision: https://reviews.llvm.org/D89313
Capitalize the profile function of APValue such that it can be used by FoldingSetNodeID
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D88643
Previously we failed to convert 'p' from array/function to pointer type,
and to represent the load of 'p' in the AST. The latter causes problems
for constant evaluation.
Update clang-tools-extra, clang/tools, clang/unittests to migrate from
`SourceManager::getBuffer`, which returns an always dereferenceable
`MemoryBuffer*`, to `getBufferOrNone` or `getBufferOrFake`, both of
which return a `MemoryBufferRef`, depending on whether the call site was
checking for validity of the buffer. No functionality change intended.
Differential Revision: https://reviews.llvm.org/D89416
Update clang/lib/StaticAnalyzer to stop relying on a `MemoryBuffer*`,
using the `MemoryBufferRef` from `getBufferOrNone` or the
`Optional<MemoryBufferRef>` from `getBufferOrFake`, depending on whether
there's logic for checking validity of the buffer. The change to
clang/lib/StaticAnalyzer/Core/IssueHash.cpp is potentially a
functionality change, since the logic was wrong (it checked for
`nullptr`, which was never returned by the old API), but if that was
reachable the new behaviour should be better.
Differential Revision: https://reviews.llvm.org/D89414
Update `clang/lib/CodeGen` to use a `MemoryBufferRef` from
`getBufferOrNone` instead of `MemoryBuffer*` from `getBuffer`. No
functionality change here.
Differential Revision: https://reviews.llvm.org/D89411
Update clang/lib/Frontend to use a `MemoryBufferRef` from
`getBufferOrFake` instead of `MemoryBuffer*` from `getBuffer`, with the
exception of `FrontendInputFile`, which I'm leaving for later.
Differential Revision: https://reviews.llvm.org/D89409
Update clang/lib/Basic to stop relying on a `MemoryBuffer*`, using the
`MemoryBufferRef` from `getBufferOrNone` or `getBufferOrFake` instead of
`getBuffer`.
Differential Revision: https://reviews.llvm.org/D89394
callee in constant evaluation.
We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.
This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
AlignedCharArrayUnion is really only needed to handle the "union" case
when we need memory of suitable size and alignment for multiple types.
SmallVector only needs storage for one type, so use that directly.
The argument passed to the preprocessor macros `NS_SWIFT_NAME(x)` and
`CF_SWIFT_NAME(x)` is stringified before passing to
`__attribute__((swift_name("x")))`.
ClangFormat didn't know about this stringification, so its custom parser
tried to parse the argument(s) passed to the macro as if they were
normal function arguments.
That means ClangFormat currently incorrectly inserts whitespace
between `NS_SWIFT_NAME` arguments with colons and dots, so:
```
extern UIWindow *MainWindow(void) NS_SWIFT_NAME(getter:MyHelper.mainWindow());
```
becomes:
```
extern UIWindow *MainWindow(void) NS_SWIFT_NAME(getter : MyHelper.mainWindow());
```
which clang treats as a parser error:
```
error: 'swift_name' attribute has invalid identifier for context name [-Werror,-Wswift-name-attribute]
```
Thankfully, D82620 recently added the ability to treat specific macros
as "whitespace sensitive", meaning their arguments are implicitly
treated as strings (so whitespace is not added anywhere inside).
This diff adds `NS_SWIFT_NAME` and `CF_SWIFT_NAME` to
`WhitespaceSensitiveMacros` so their arguments are implicitly treated
as whitespace-sensitive.
Test Plan:
New tests added. Ran tests with:
% ninja FormatTests && ./tools/clang/unittests/Format/FormatTests
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D89425
Remove `ContentCache::getBuffer`, which always returned a
dereferenceable `MemoryBuffer*` and had a `bool*Invalid` out parameter,
and replace it with:
- `ContentCache::getBufferOrNone`, which returns
`Optional<MemoryBufferRef>`. This is the new API that consumers should
use. Later it could be renamed to `getBuffer`, but intentionally using
a different name to root out any unexpected callers.
- `ContentCache::getBufferPointer`, which returns `MemoryBuffer*` with
"optional" semantics. This is `private` to avoid growing callers and
`SourceManager` has temporarily been made a `friend` to access it.
Later paches will update the transitive callers to not need a raw
pointer, and eventually this will be deleted.
No functionality change intended here.
Differential Revision: https://reviews.llvm.org/D89348
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through
a compiler flag. This makes it easier to test and transition between different
C++ ABIs through compile flags rather than build flags.
In this patch:
- Store `-fc++-abi=` in a LangOpt. This isn't stored in a
CodeGenOpt because there are instances outside of codegen where Clang
needs to know what the ABI is (particularly through
ASTContext::createCXXABI), and we should be able to override the
target default if the flag is provided at that point.
- Expose the existing ABIs in TargetCXXABI as values that can be passed
through this flag.
- Create a .def file for these ABIs to make it easier to check flag
values.
- Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
clang --target arm-none-eabi --print-libgcc-file-name --rtlib=compiler-rt
used to print `/path/to/lib/clang/version/lib/libclang_rt.builtins-arm.a`
but should print `/path/to/lib/clang/version/lib/baremetal/libclang_rt.builtins-arm.a`.
Similarly, --target armv7m-none-eabi should print libclang_rt.builtins-armv7m.a
This matches the compiler-rt file name used at link time in the
baremetal driver.
Reviewed By: manojgupta
Differential Revision: https://reviews.llvm.org/D89327
Summary:
This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.
Reviewed By: MaskRay, DiggerLin
Differential Revision: https://reviews.llvm.org/D88737
During the import of attributes we forgot to set the spelling list
index. This caused a segfault when we wanted to traverse the AST
(e.g. by the dump() method).
Differential Revision: https://reviews.llvm.org/D89318
During the import of FormatAttrs we forgot to import the type (e.g
`__scanf__`) of the attribute. This caused a segfault when we wanted to
traverse the AST (e.g. by the dump() method).
Differential Revision: https://reviews.llvm.org/D89319
Instead of collecting all specializations and doing a post-filterin, we
can just get all targeted specializations from getPartialSpecializationsizations.
Differential Revision: https://reviews.llvm.org/D89220
callee in constant evaluation.
We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.
This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
callee in constant evaluation.
We previously made a deep copy of function parameters of class type when
passing them, resulting in the destructor for the parameter applying to
the original argument value, ignoring any modifications made in the
function body. This also meant that the 'this' pointer of the function
parameter could be observed changing between the caller and the callee.
This change completely reimplements how we model function parameters
during constant evaluation. We now model them roughly as if they were
variables living in the caller, albeit with an artificially reduced
scope that covers only the duration of the function call, instead of
modeling them as temporaries in the caller that we partially "reparent"
into the callee at the point of the call. This brings some minor
diagnostic improvements, as well as significantly reduced stack usage
during constant evaluation.
AIX has different layout dumping format from other itanium ABIs.
And for these two cases, use regex to match AIX format.
Differential Revision: https://reviews.llvm.org/D89064
With this change, we're more or less ready to allow users outside
of the Static Analyzer to take advantage of path diagnostic consumers
for emitting their warnings in different formats.
Differential Revision: https://reviews.llvm.org/D67422
IssueHash is an attempt to introduce stable warning identifiers
that won't change when code around them gets moved around.
Path diagnostic consumers print issue hashes for the emitted diagnostics.
This move will allow us to ultimately move path diagnostic consumers
to libAnalysis.
Differential Revision: https://reviews.llvm.org/D67421
The AnalyzerOptions object contains too much information that's
entirely specific to the Analyzer. It is also being referenced by
path diagnostic consumers to tweak their behavior. In order for path
diagnostic consumers to function separately from the analyzer,
make a smaller options object that only contains relevant options.
Differential Revision: https://reviews.llvm.org/D67420
Change EmitAsmStmt() to
- Not tie physregs with the "+r" constraint, but instead add the hard
register as an input constraint. This makes "+r" and "=r":"r" look the same
in the output.
Background: Macro intensive user code may contain inline assembly
statements with multiple operands constrained to the same physreg. Such a
case (with the operand constraints "+r" : "r") currently triggers the
TwoAddressInstructionPass assertion against any extra use of a tied
register. Furthermore, TwoAddress will insert a COPY to that physreg even
though isel has already done so (for the non-tied use), which may lead to a
second redundant instruction currently. A simple fix for this is to not
emit tied physreg uses in the first place for the "+r" constraint, which is
what this patch does.
- Give an error on multiple outputs to the same physical register.
This should be reported and this is also what GCC does.
Review: Ulrich Weigand, Aaron Ballman, Jennifer Yu, Craig Topper
Differential Revision: https://reviews.llvm.org/D87279
After D88666, which implemented DirectoryWatcher on Windows, we're
seeing test failures on Chromium's Windows bots.
Try raising the timeout in case the test is failing due to high load on
the machine.
Followup to D85191.
This changes getTypeInfoInChars to return a TypeInfoChars
struct instead of a std::pair of CharUnits. This lets the
interface match getTypeInfo more closely.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D86447
This patch resumes the work of D16586.
According to the AAPCS, volatile bit-fields should
be accessed using containers of the widht of their
declarative type. In such case:
```
struct S1 {
short a : 1;
}
```
should be accessed using load and stores of the width
(sizeof(short)), where now the compiler does only load
the minimum required width (char in this case).
However, as discussed in D16586,
that could overwrite non-volatile bit-fields, which
conflicted with C and C++ object models by creating
data race conditions that are not part of the bit-field,
e.g.
```
struct S2 {
short a;
int b : 16;
}
```
Accessing `S2.b` would also access `S2.a`.
The AAPCS Release 2020Q2
(https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=)
section 8.1 Data Types, page 36, "Volatile bit-fields -
preserving number and width of container accesses" has been
updated to avoid conflict with the C++ Memory Model.
Now it reads in the note:
```
This ABI does not place any restrictions on the access widths of bit-fields where the container
overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field
placed between two other bit-fields. This is because the C/C++ memory model defines these as being
separate memory locations, which can be accessed by two threads simultaneously. For this reason,
compilers must be permitted to use a narrower memory access width (including splitting the access into
multiple instructions) to avoid writing to a different memory location. For example, in
struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two
memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };,
writes to a or b must not overwrite each other.
```
I've updated the patch D16586 to follow such behavior by verifying that we
only change volatile bit-field access when:
- it won't overlap with any other non-bit-field member
- we only access memory inside the bounds of the record
- avoid overlapping zero-length bit-fields.
Regarding the number of memory accesses, that should be preserved, that will
be implemented by D67399.
Reviewed By: ostannard
Differential Revision: https://reviews.llvm.org/D72932
Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences.
The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions.
Differential Revision: https://reviews.llvm.org/D87604
References to different declarations of the same entity aren't different
values, so shouldn't have different representations.
Recommit of e6393ee813, most recently
reverted in 9a33f027ac due to a bug caused
by ObjCInterfaceDecls not propagating availability attributes along
their redeclaration chains; that bug was fixed in
e2d4174e9c.
chain for ObjCInterfaceDecls.
Only one such declaration can actually have attributes (the definition,
if any), but generally we assume that we can look for InheritedAttrs on
the most recent declaration.
They can get stale at use time because of updates from other recursive
specializations. Instead, rely on the existence of previous declarations to add
the specialization.
Differential Revision: https://reviews.llvm.org/D87853
Some tests start to fail after https://reviews.llvm.org/D89066.
It's because the size of pointers are different on different targets.
Limit the target in the command so there is no confusion.
Also noticed I had typo in the test name.
Adding disable-llvm-passes option to make the test more stable as well.
Differential Revision: https://reviews.llvm.org/D89269
This is a prep patch for changing SourceManager to return
`Optional<MemoryBufferRef>` instead of `MemoryBuffer`. With that change the
address of the MemoryBuffer will be gone, so instead use the start of the
buffer as the key for this map.
No functionality change intended, as it's expected that the pointer identity
matches between the buffers and the buffer data.
Radar-Id: rdar://70139990
Differential Revision: https://reviews.llvm.org/D89136
See PR47804:
TreeTransform uses TransformedLocalDecls as a map of declarations that
have been transformed already. When doing a "TransformDecl", which
happens in the cases of updating a DeclRefExpr's target, the default
implementation simply returns the already transformed declaration.
However, this was not including ParmVarDecls. SO, any use of
TreeTransform that didn't re-implement TransformDecl would NOT properly
update the target of a DeclRefExpr, resulting in odd behavior.
In the case of Typo-recovery, the result was that a lambda that used its
own parameter would cause an error, since it thought that the
ParmVarDecl referenced was a different lambda. Additionally, this caused
a problem in the AST (a declrefexpr into another scope) such that a
future instantiation would cause an assertion.
This patch ensures that the ParmVarDecl transforming process records
into TransformedLocalDecls so that the DeclRefExpr is ALSO updated.
In https://reviews.llvm.org/D87470 I added the change to tighten the lifetime of the expression awaiter.await_suspend().address.
Howver it was incorrect. ExprWithCleanups will call the dtor and end the lifetime for all the temps created in the current full expr.
When this is called on a normal await call, we don't want to do that.
We only want to do this for the call on the final_awaiter, to avoid writing into the frame after the frame is destroyed.
This change fixes it, by checking IsImplicit.
Differential Revision: https://reviews.llvm.org/D89066
Jeremy Morse discovered an issue with the lit test introduced in D88363. The
test gives different results for Sony's `-O1`.
The test needs to run at `-O1` otherwise the likelihood attribute will be
ignored. Instead of running all `-O1` passes it only runs the lower-expect pass
which is needed to lower `__builtin_expect`.
Differential Revision: https://reviews.llvm.org/D89204
The dependent mechanism for C error-recovery is mostly finished,
this is the only place we have missed.
Differential Revision: https://reviews.llvm.org/D89045
Given the following VarTemplateDecl AST,
```
VarTemplateDecl col:26 X
|-TemplateTypeParmDecl typename depth 0 index 0
`-VarDecl X 'bool' cinit
`-CXXBoolLiteralExpr 'bool' true
```
previously, we returned the VarDecl as the top-level decl, which was not
correct, the top-level decl should be VarTemplateDecl.
Differential Revision: https://reviews.llvm.org/D89098
This reverts commit 849c60541b because it
results in a stage 2 build failure:
llvm-project/clang/include/clang/AST/ExternalASTSource.h:409:20: error:
definition with same mangled name
'_ZN5clang25LazyGenerationalUpdatePtrIPKNS_4DeclEPS1_XadL_ZNS_17ExternalASTSource19CompleteRedeclChainES3_EEE9makeValueERKNS_10ASTContextES4_'
as another definition
static ValueType makeValue(const ASTContext &Ctx, T Value);
parameter in its notion of template argument identity.
We already did this for all the other kinds of non-type template
argument. We're still missing the type from the mangling, so we continue
to be able to see collisions at link time; that's an open ABI issue.
And another step towards transforms not introducing inttoptr and/or
ptrtoint casts that weren't there already.
As we've been establishing (see D88788/D88789), if there is a int<->ptr cast,
it basically must stay as-is, we can't do much with it.
I've looked, and the most source of new such casts being introduces,
as far as i can tell, is this transform, which, ironically,
tries to reduce count of casts..
On vanilla llvm test-suite + RawSpeed, @ `-O3`, this results in
-33.58% less `IntToPtr`s (19014 -> 12629)
and +76.20% more `PtrToInt`s (18589 -> 32753),
which is an increase of +20.69% in total.
However just on RawSpeed, where i know there are basically
none `IntToPtr` in the original source code,
this results in -99.27% less `IntToPtr`s (2724 -> 20)
and +82.92% more `PtrToInt`s (4513 -> 8255).
which is again an increase of 14.34% in total.
To me this does seem like the step in the right direction,
we end up with strictly less `IntToPtr`, but strictly more `PtrToInt`,
which seems like a reasonable trade-off.
See https://reviews.llvm.org/D88860 / https://reviews.llvm.org/D88995
for some more discussion on the subject.
(Eventually, `CastInst::isNoopCast()`/`CastInst::isEliminableCastPair`
should be taught about this, yes)
Reviewed By: nlopes, nikic
Differential Revision: https://reviews.llvm.org/D88979
GCC 11 will define this macro.
In LLVM, the feature flag only applies to 64-bit mode and we always define the
macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can
suppress the macro. The discrepancy can unlikely cause trouble.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D89198
At AMD, in an internal audit of our code, we found some corner cases
where we were not quite differentiating targets enough for some old
hardware. This commit is part of fixing that by adding three new
targets:
* The "Oland" and "Hainan" variants of gfx601 are now split out into
gfx602. LLPC (in the GPUOpen driver) and other front-ends could use
that to avoid using the shaderZExport workaround on gfx602.
* One variant of gfx703 is now split out into gfx705. LLPC and other
front-ends could use that to avoid using the
shaderSpiCsRegAllocFragmentation workaround on gfx705.
* The "TongaPro" variant of gfx802 is now split out into gfx805.
TongaPro has a faster 64-bit shift than its former friends in gfx802,
and a subtarget feature could be set up for that to take advantage of
it. This commit does not make that change; it just adds the target.
V2: Add clang changes. Put TargetParser list in order.
V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order,
so fix the GPUKind order.
Differential Revision: https://reviews.llvm.org/D88916
Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
This implements the directory watcher on Windows. It does the most
naive thing for simplicity. ReadDirectoryChangesW is used to monitor
the changes. However, in order to support interruption, we must use
overlapped IO, which allows us to use the blocking, synchronous
mechanism. We create a thread to post the notification to the consumer
to allow the monitoring to continue. The two threads communicate via a
locked queue.
Differential Revision: https://reviews.llvm.org/D88666
Reviewed By: Adrian McCarthy
There doesn't seem to be a direct test of this, and I'm planning to make
future changes which will affect it.
I'm not particularly familiar with the blocks extension, so suggestions
for better tests are welcome.
Differential Revision: https://reviews.llvm.org/D88754
Currently, Clang looks for libc++ headers alongside the installation
directory of Clang, and it also adds a search path for headers in the
-isysroot. This is problematic if headers are found in both the toolchain
and in the sysroot, since #include_next will end up finding the libc++
headers in the sysroot instead of the intended system headers.
This patch changes the logic such that if the toolchain contains libc++
headers, no C++ header paths are added in the sysroot. However, if the
toolchain does *not* contain libc++ headers, the sysroot is searched as
usual.
This should not be a breaking change, since any code that previously
relied on some libc++ headers being found in the sysroot suffered from
the #include_next issue described above, which renders any libc++ header
basically useless.
Differential Revision: https://reviews.llvm.org/D89001
Extended -cl-std/std flag with CL3.0 and added predefined version macros.
Patch by Anton Zabaznov (azabaznov)!
Tags: #clang
Differential Revision: https://reviews.llvm.org/D88300
z/OS defaults to 16 bytes for __attribute__((aligned)), modify the test to differentiate between z/OS and Linux on s390x.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D89127
The Callbacks.cpp test was taking a long time to compile on some build bots
causing timeouts. This patch splits up that test into five separate cpp
files and a header file.
Reviewed By: gribozavr2
Differential Revision: https://reviews.llvm.org/D88886
This patch extracts the ExprMutAnalyzer changes from https://reviews.llvm.org/D54943
into its own revision for simpler review and more atomic changes.
The analysis results are improved. Nested expressions (e.g. conditional
operators) are now detected properly. Some edge cases, especially
template induced imprecisions are improved upon.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D88088
For example:
union M256 {
double d;
__m256 m;
};
extern void foo1(union M256 A);
union M256 m1;
void test() {
foo1(m1);
}
clang will pass m1 through stack which does not follow the ABI.
Differential Revision: https://reviews.llvm.org/D78699
ObjCContainerDecl.getMethod returns a nullptr by default when the
container is a hidden prototype. Callsites where the method is being
looked up on the redeclaration's own container should skip this check
since they (rightly) expect a valid method to be found.
Resolves rdar://69444243
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D89024
Previously, when clang was compiled with -DLLVM_ENABLE_ASSERTIONS=ON, the added tests were displaying:
inlinable function call in a function with debug info must have a !dbg location
call void @"??1?$c@UB@@@@QEAA@XZ"(%struct.c* @"?f@?1??d@@YAPEAU?$c@UB@@@@XZ@4U2@A")
fatal error: error in backend: Broken module found, compilation aborted!
Stack dump:
0. Program arguments: <f:\svn\buildninja\bin\clang -cc1 -emit-llvm debug-info-no-location.cpp> -gcodeview -debug-info-kind=limited
1. <eof> parser at end of file
2. Per-function optimization
Fixes PR43012
Differential Revision: https://reviews.llvm.org/D66328
types.
Previously, a type-dependent cast to a deduced class template
specialization type would end up with a non-dependent class template
specialization type, leading to confusion downstream.
Move it as an EP callback (-O[123]) or in addSanitizersAtO0.
This makes it not run in ThinLTO pre-link (like the other sanitizers),
so don't check LTO runs in hwasan-new-pm.c. Changing its position also
seems to change the generated IR. I think we just need to make sure the
pass runs.
Reviewed By: leonardchan
Differential Revision: https://reviews.llvm.org/D88936
Summary:
Replace the OpenMP Runtime Library functions used in CGOpenMPRuntimeGPU
for OpenMP device code generation with ones in OMPKinds.def and use
OMPIRBuilder for generating runtime calls. This allows us to
consolidate more OpenMP code generation into the OMPIRBuilder. Future
additions to the GPU runtime functions should now go in OMPKinds.def
Reviewers: jdoerfert
Subscribers: aaron.ballman cfe-commits guansong llvm-commits sstefan1 yaxunl
Tags: #OpenMP #LLVM #clang
Differential Revision: https://reviews.llvm.org/D88430
Summary:
This patch changes the CMake files for Clang and Libomptarget to query the
system for its supported CUDA architecture. This makes it much easier for the
user to build optimal code without needing to set the flags manually. This
relies on the now deprecated FindCUDA method in CMake, but full support for
architecture detection is only availible in CMake >3.18
Reviewers: jdoerfert ye-luo
Subscribers: cfe-commits guansong mgorny openmp-commits sstefan1 yaxunl
Tags: #clang #OpenMP
Differential Revision: https://reviews.llvm.org/D87946
Patch VisitCXXDeleteExpr() in clang::UsedDeclVisitor to avoid it crashing
when the expression's destroyed type is null. According to the comments
in CXXDeleteExpr::getDestroyedType(), this can happen when the type to
delete is a dependent type.
Patch by Geoff Levner.
Differential Revision: https://reviews.llvm.org/D88949
SUMMARY:
In IBM compiler xlclang , there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html.
We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option llvm does not emit any visibility attribute to ASM or XCOFF object file.
The option only work on the AIX OS, for other non-AIX OS using the option will report an unsupported options error.
In AIX OS:
1.1 the option -mignore-xcoff-visibility is enabled by default , if there is not -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command .
1.2 if there is -fvisibility=* explicitly but not -mignore-xcoff-visibility explicitly in the clang command. it will generate visibility attributes.
1.3 if there are both -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command. The option "-mignore-xcoff-visibility" wins , it do not emit the visibility attribute.
The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR.
Reviewer: daltenty,Jason Liu
Differential Revision: https://reviews.llvm.org/D87451
Summary:
This patch adds an error to Clang that detects if OpenMP offloading is used
between two architectures with incompatible pointer sizes. This ensures that
the data mapping can be done correctly and solves an issue in code generation
generating the wrong size pointer.
Reviewer: jdoerfert
Subscribers: cfe-commits delcypher guansong llvm-commits sstefan1 yaxunl
Tags: #OpenMP #Clang
Differential Revision: https://reviews.llvm.org/D88594
Object of class `Command` contains various properties of a command to
execute, but output file was missed from them. This change adds this
property. It is required for reporting consumed time and memory implemented
in D78903 and may be used in other cases too.
Differential Revision: https://reviews.llvm.org/D78902
Have the build work out of the box by forcing an LLD build.
That way, we don't require an external LTO-aware linker,
as we build one.
Also remove reference to the seemingly dead builder.
Differential Revision: https://reviews.llvm.org/D88990
While debugging a different clang-format failure, I tried to reuse the
MacroExpander lexer, but was surprised to see that it marks all C++
keywords (e.g. const, decltype) as being of type identifier. After stepping
through the ::format() code, I noticed that the difference between these
two is that the identifier table was not being initialized based on the
FormatStyle, so only basic tokens such as tok::semi, tok::plus, etc. were
being handled.
Reviewed By: klimek
Differential Revision: https://reviews.llvm.org/D88952
This improves the debugging experience since LLDB will print the enumerator
name instead of a decimal number. This changes TokenType to have uint8_t
as the underlying type and moves it after the remaining bitfields to avoid
increasing the size of FormatToken.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D87006
This reapplies D88384 with the minor modification that an assertion was
changed to a regular conditional and graceful exit from
ASTContext::mergeTypes.
Ensure that we evaluate assignment and compound-assignment
right-to-left, and array subscripting left-to-right.
Fixes PR47724.
This is a re-commit of ded79be, reverted in 37c74df, with a fix and test
for the crasher bug previously introduced.
Set the default alignment control variables for z/OS target and add test case for alignment rules on z/OS.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D88845
Separate __clang_hip_math.h header into __clang_hip_cmath.h
and __clang_hip_math.h. Improve the math function definition,
and add missing definitions or declarations. Add missing
overloads.
Reviewed By: tra, JonChesterfield
Differential Review: https://reviews.llvm.org/D88837
A lot of our code building with clang-cl.exe using Clang 11 was failing with
the following 2 type of errors:
1. explicit specialization of 'foo' after instantiation
2. no matching function for call to 'bar'
Note that we also use -fdelayed-template-parsing in our builds.
I tried pretty hard to get a small repro for these failures, but couldn't. So
there is some subtle edge case in the -fpch-instantiate-templates feature
introduced by this change: https://reviews.llvm.org/D69585
When I tried turning this off using -fno-pch-instantiate-templates, builds
would silently fail with the same error without any indication that
-fno-pch-instantiate-templates was being ignored by the compiler. Then I
realized this "no" option wasn't actually working when I ran Clang under a
debugger.
Differential revision: https://reviews.llvm.org/D88680
(it was introduced in https://lists.llvm.org/pipermail/llvm-dev/2015-January/080956.html)
This canonicalization seems dubious.
Most importantly, while it does not create `inttoptr` casts by itself,
it may cause them to appear later, see e.g. D88788.
I think it's pretty obvious that it is an undesirable outcome,
by now we've established that seemingly no-op `inttoptr`/`ptrtoint` casts
are not no-op, and are no longer eager to look past them.
Which e.g. means that given
```
%a = load i32
%b = inttoptr %a
%c = inttoptr %a
```
we likely won't be able to tell that `%b` and `%c` is the same thing.
As we can see in D88789 / D88788 / D88806 / D75505,
we can't really teach SCEV about this (not without the https://bugs.llvm.org/show_bug.cgi?id=47592 at least)
And we can't recover the situation post-inlining in instcombine.
So it really does look like this fold is actively breaking
otherwise-good IR, in a way that is not recoverable.
And that means, this fold isn't helpful in exposing the passes
that are otherwise unaware of these patterns it produces.
Thusly, i propose to simply not perform such a canonicalization.
The original motivational RFC does not state what larger problem
that canonicalization was trying to solve, so i'm not sure
how this plays out in the larger picture.
On vanilla llvm test-suite + RawSpeed, this results in
increase of asm instructions and final object size by ~+0.05%
decreases final count of bitcasts by -4.79% (-28990),
ptrtoint casts by -15.41% (-3423),
and of inttoptr casts by -25.59% (-6919, *sic*).
Overall, there's -0.04% less IR blocks, -0.39% instructions.
See https://bugs.llvm.org/show_bug.cgi?id=47592
Differential Revision: https://reviews.llvm.org/D88789
D17779: host-side shadow variables of external declarations of device-side
global variables have internal linkage and are referenced by
`__cuda_register_globals`.
nvcc from CUDA 11 does not allow `__device__ inline` or `__device__ constexpr`
(C++17 inline variables) but clang has incorrectly supported them for a while:
```
error: A __device__ variable cannot be marked constexpr
error: An inline __device__/__constant__/__managed__ variable must have internal linkage when the program is compiled in whole program mode (-rdc=false)
```
If such a variable (which has a comdat group) is discarded (a copy from another
translation unit is prevailing and selected), accessing the variable from
outside the section group (`__cuda_register_globals`) is a violation of the ELF
specification and will be rejected by linkers:
> A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed.
As a workaround, don't register such inline variables for now.
(If we register the variables in all TUs, we will keep multiple instances of the shadow and break the C++ semantics for inline variables).
We should reject such variables in Sema but our internal users need some time to migrate.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D88786
API Notes are a feature which allows annotation of headers by an
auxiliary file that contains metadata for declarations pertaining to the
associated module. This enables adding attributes to declarations
without requiring modification of the headers, enabling finer grained
control for library headers for consumers without having to modify
external headers.
Differential Revision: https://reviews.llvm.org/D88446
Reviewed By: Richard Smith, Marcel Hlopko
Currently Flang uses TextDiagnostic, TextDiagnosticPrinter &
TestDiagnosticBuffer classes from Clang (more specifically, from
libclangFrontend). This patch introduces simplified equivalents of these
classes in Flang (i.e. it removes the dependency on libclangFrontend).
Flang only needs these diagnostics classes for the compiler driver
diagnostics. This is unlike in Clang in which similar diagnostic classes
are used for e.g. Lexing/Parsing/Sema diagnostics. For this reason, the
implementations introduced here are relatively basic. We can extend them
in the future if this is required.
This patch also enhances how the diagnostics are printed. In particular,
this is the diagnostic that you'd get _before_ the changes introduced here
(no text formatting):
```
$ bin/flang-new
error: no input files
```
This is the diagnostic that you get _after_ the changes introduced here
(in terminals that support it, the text is formatted - bold + red):
```
$ bin/flang-new
flang-new: error: no input files
```
Tests are updated accordingly and options related to enabling/disabling
color diagnostics are flagged as supported by Flang.
Reviewed By: sameeranjoshi, CarolineConcatto
Differential Revision: https://reviews.llvm.org/D87774
Summary:
This patch adds an error to Clang that detects if OpenMP offloading is
used between two architectures with incompatible pointer sizes. This
ensures that the data mapping can be done correctly and solves an issue
in code generation generating the wrong size pointer. This patch adds a
new lit substitution, %omp_powerpc_triple that, if the system is 32-bit or
64-bit, sets the powerpc triple accordingly. This was required to fix
some OpenMP tests that automatically populated the target architecture.
Reviewers: jdoerfert
Subscribers: cfe-commits guansong sstefan1 yaxunl delcypher
Tags: OpenMP clang LLVM
Differential Revision: https://reviews.llvm.org/D88594
The error-bit was missing, if a DeclRefExpr (which refers to a VarDecl
with a contains-errors initializer).
It could cause different violations in clang -- the DeclRefExpr is value-dependent,
but not contains-errors, `ABC<DeclRefExpr>` could produce a non-error
and non-dependent type in non-template context, which will lead to
crashes in constexpr evaluation.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D86048
By convention the default output file for -E is "-" (stdout).
This is expected by tools like ccache, which uses output
of -E to determine if a file and its dependence has changed.
Currently clang does not use stdout as default output file for -E
for HIP, which causes ccache not working.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D88730
Add an option --gpu-instrument-lib= to allow users to specify
an instrument device library. This is for supporting -finstrument
in device code for debugging/profiling tools.
Differential Revision: https://reviews.llvm.org/D88557
This is one of the reason for extra invalidations in D84959. In
practice, I don't think we have use cases needing this. This simplifies
the pipeline a bit and prune corner cases when considering
invalidations.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D85676
We were taking multiple pointer arguments in the builtin.
gcc accepts a single void*.
The cast from void* to _m128i* caused the IR generation to assume
the pointer was aligned.
Instead make the builtin take a single void*, emit i8* GEPs to
adjust then cast to <2 x i64>* and perform a store with align of 1.
Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp.
The instructions correspond to the following builtins:
int vec_test_swdiv(vector double v1, vector double v2);
int vec_test_swdivs(vector float v1, vector float v2);
int vec_test_swsqrt(vector double v1);
int vec_test_swsqrts(vector float v1);
This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC.
Reviewed By: steven.zhang, amyk
Differential Revision: https://reviews.llvm.org/D88278
Bruno De Fraine discovered some issues with D85091. The branch weights
generated for `logical not` and `ternary conditional` were wrong. The
`logical and` and `logical or` differed from the code generated of
`__builtin_predict`.
Adjusted the generated code for the likelihood to match
`__builtin_predict`. The patch is based on Bruno's suggestions.
Differential Revision: https://reviews.llvm.org/D88363
The function `TryListConversion` didn't properly validate the following
part of the standard:
Otherwise, if the parameter type is a character array [... ]
and the initializer list has a single element that is an
appropriately-typed string literal (8.5.2 [dcl.init.string]), the
implicit conversion sequence is the identity conversion.
This caused the following call to `f()` to be ambiguous.
void f(int(&&)[1]);
void f(unsigned(&&)[1]);
void g(unsigned i) {
f({i});
}
This issue only occurs when the initializer list had one element.
Differential Revision: https://reviews.llvm.org/D87561
This helper method is useful even outside of Gnu toolchains, so move
it to ToolChain so it can be reused in other toolchains such as Fuchsia.
Differential Revision: https://reviews.llvm.org/D88452
AMDGPU toolchain currently only diagnose invalid target ID for OpenCL
source compilation. Invalid target ID is not diagnosed for assembler.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D88377
Currently CUDA/HIP toolchain uses "unknown" as bound arch
for offload action for fat binary. This causes -mcpu or -march
with "unknown" added in HIPToolChain::TranslateArgs or
CUDAToolChain::TranslateArgs.
This causes issue for https://reviews.llvm.org/D88377 since
HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs.
The bound arch of offload action for fat binary is not really
used, therefore set it to CudaArch::UNUSED.
Differential Revision: https://reviews.llvm.org/D88524
We now recognize this function as a builtin despite it having an
unexpected number of parameters; make sure we don't enforce that it has
only 1 argument for its 2 parameters.
To facilitate faster loading of device binaries and share them among processes,
HIP runtime favors their alignment being 4096 bytes. HIP runtime can load
unaligned device binaries, however, aligning them at 4096 bytes results in
faster loading and less shared memory usage.
This patch adds an option -bundle-align to clang-offload-bundler which allows
bundles to be aligned at specified alignment. By default it is 1, which is NFC
compared to existing format.
This patch then aligns embedded fat binary and device binary inside fat binary
at 4096 bytes.
It has been verified this change does not cause significant overall file size increase
for typical HIP applications (less than 1%).
Differential Revision: https://reviews.llvm.org/D88734
Summary:
Motivated by the new objc_direct attribute, this change adds a new
attribute that remotes metadata from Protocols that the programmer knows
isn't going to be used at runtime. We simply have the frontend skip
generating any protocol metadata entries (e.g. OBJC_CLASS_NAME,
_OBJC_$_PROTOCOL_INSTANCE_METHDOS, _OBJC_PROTOCOL, etc) for a protocol
marked with `__attribute__((objc_non_runtime_protocol))`.
There are a few APIs used to retrieve a protocol at runtime.
`@protocol(SomeProtocol)` will now error out of the requested protocol
is marked with attribute. `objc_getProtocol` will return `NULL` which
is consistent with the behavior of a non-existing protocol.
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75574
This helper method is useful even outside of Gnu toolchains, so move
it to ToolChain so it can be reused in other toolchains such as Fuchsia.
Differential Revision: https://reviews.llvm.org/D88452
Expand the list of targets that support cfi-icall.
Add ThinLTO everywhere LTO is mentioned. AFAIK all CFI features are
supported with ThinLTO.
Differential Revision: https://reviews.llvm.org/D87717
This adds support for -mcpu=cortex-r82. Some more information about this
core can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-r/cortex-r82
One note about the system register: that is a bit of a refactoring because of
small differences between v8.4-A AArch64 and v8-R AArch64.
This is based on patches from Mark Murray and Mikhail Maltsev.
Differential Revision: https://reviews.llvm.org/D88660
Fix premature decision in the presence of type-dependent expression
operands on whether AltiVec vector initializations from single
expressions are "splat" operations.
Verify that the instantiation is able to determine the correct cast
semantics for both the scalar type and the vector type case.
Note that, because the change only affects the single-expression
case (and the target type is an AltiVec-style vector type), the
replacement of a parenthesized list with a parenthesized expression
does not change the semantics of the program in a program-observable
manner.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D88526
We prefer autodetection here to avoid persisting this configuration
in the generated __config header which is shared across targets.
Differential Revision: https://reviews.llvm.org/D88694
This is an alternate fix (see D87835) for a bug where a NaN constant
gets wrongly transformed into Infinity via truncation.
In this patch, we uniformly convert any SNaN to QNaN while raising
'invalid op'.
But we don't have a way to directly specify a 32-bit SNaN value in LLVM IR,
so those are always encoded/decoded by calling convert from/to 64-bit hex.
See D88664 for a clang fix needed to allow this change.
Differential Revision: https://reviews.llvm.org/D88238