Add methods for loop unrolling to the OpenMPIRBuilder class and use them in Clang if `-fopenmp-enable-irbuilder` is enabled. The unrolling methods are:
* `unrollLoopFull`
* `unrollLoopPartial`
* `unrollLoopHeuristic`
`unrollLoopPartial` and `unrollLoopHeuristic` can use compiler heuristics to automatically determine the unroll factor. If possible, that is if no CanonicalLoopInfo is required to pass to another method, metadata for LLVM's LoopUnrollPass is added. Otherwise the unroll factor is determined using the same heurstics as user by LoopUnrollPass. Not requiring a CanonicalLoopInfo, especially with `unrollLoopHeuristic` allows greater flexibility.
With full unrolling and partial unrolling with known unroll factor, instead of duplicating instructions by the OpenMPIRBuilder, the full unroll is still delegated to the LoopUnrollPass. In case of partial unrolling the loop is first tiled using the existing `tileLoops` methods, then the inner loop fully unrolled using the same mechanism.
Reviewed By: jdoerfert, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D107764
Per the contract of ReadLateParsedTemplates, we should not be returning
the same results multiple times. No functionality change intended, other
than to runtime.
Thanks to Luboš Luňák for identifying the cause of the regression!
There is a separate field `isPragmaOnce` and when `isImport` combines
both, it complicates HeaderFileInfo serialization as `#pragma once` is
the inherent property of the header while `isImport` reflects how other
headers use it. The usage of the header can be different in different
contexts, that's why `isImport` requires tracking separate from `#pragma once`.
Differential Revision: https://reviews.llvm.org/D104351
The commandline flag to specify a particular openmp devicertl library
currently errors like:
```
fatal error: cannot open file
'./runtimes/runtimes-bins/openmp/libomptarget':
Is a directory
```
CommonArgs successfully appends the directory to the commandline args then
mlink-builtin-bitcode rejects it.
This patch is a point fix to that. If --libomptarget-amdgcn-bc-path=directory
then append the expected name for the current architecture and go on as before.
This is useful for test runners that don't hardcode the architecture.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109057
Document 'use-external-name' and the various bits of logic that make it
work, to avoid others having to repeat the archival work (given that I
added getFileRefReturnsCorrectNameForDifferentStatPath to
FileManagerTest, seems possible I understood this once before!).
- b59cf679e8 added 'use-external-name' to
RedirectingFileSystem. This causes `stat`s to return the external
name for a redirected file instead of the name it was accessed by,
leaking it through the VFS.
- d066d4c849 propagated the external name
further through clang::FileManager.
- 4dc5573acc, which added
clang::FileEntryRef to clang::FileManager, has complicated concession
to account for this as well (since refactored a bit).
The goal of 'use-external-name' is to enable Clang to report "real" file
paths to users (via diagnostics) and to external tools (such as
debuggers reading debug info and build systems reading `.d` files).
I've added FIXMEs to look at other channels for communicating the
external names, since the current implementation adds complexity to
FileManager and exposes an inconsistent interface to clients.
Besides that, the FileManager logic appears to be kicking in outside of
'use-external-name'. Seems that *some* vfs::FileSystem implementations
canonicalize some paths returned by `stat` in *some* cases (the bug
isn't fully understood yet). Volodymyr Sapsai is investigating, this at
least better documents what *is* understood.
Given D109057, change test runner to use the libomptarget-x-bc-path
argument instead of the LIBRARY_PATH environment variable to find the device
library.
Also drop the use of LIBRARY_PATH environment variable as it is far
too easy to pull in the device library from an unrelated toolchain by accident
with the current setup. No loss in flexibility to developers as the clang
commandline used here is still available.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109061
The commandline flag to specify a particular openmp devicertl library
currently errors like:
```
fatal error: cannot open file
'./runtimes/runtimes-bins/openmp/libomptarget':
Is a directory
```
CommonArgs successfully appends the directory to the commandline args then
mlink-builtin-bitcode rejects it.
This patch is a point fix to that. If --libomptarget-amdgcn-bc-path=directory
then append the expected name for the current architecture and go on as before.
This is useful for test runners that don't hardcode the architecture.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109057
Discovered in SYCL, the field annotations were always cast to an i8*,
which is an invalid bitcast for a pointer type with an address space.
This patch makes sure that we create an intrinsic that takes a pointer
to the correct address-space and properly do our casts.
Differential Revision: https://reviews.llvm.org/D109003
The intent of this patch is to add support of -fp-model=[source|double|extended] to allow
the compiler to use a wider type for intermediate floating point calculations. As a side
effect to that, the value of FLT_EVAL_METHOD is changed according to the pragma
float_control.
Unfortunately some issue was uncovered with this change in preprocessing. See details in
https://reviews.llvm.org/D93769 . We are therefore reverting this patch until we find a way
to reconcile the value of FLT_EVAL_METHOD, the pragma and the -E flow.
This reverts commit 66ddac22e2.
Original commit message:"
The current infrastructure in lib/Interpreter has a tool, clang-repl, very
similar to clang-interpreter which also allows incremental compilation.
This patch moves clang-interpreter as a test case and drops it as conditionally
built example as we already have clang-repl in place.
Differential revision: https://reviews.llvm.org/D107049
"
This patch also ignores ppc due to missing weak symbol for __gxx_personality_v0
which may be a feature request for the jit infrastructure. Also, adds a missing
build system dependency to the orc jit.
Modifies OpenCL 3.0 optional core feature macro definitions so that
they are set analogously in C++ for OpenCL 2021.
This change aims to achieve compatibility between C++ for OpenCL
2021 and OpenCL 3.0.
Differential Revision: https://reviews.llvm.org/D108704
The current infrastructure in lib/Interpreter has a tool, clang-repl, very
similar to clang-interpreter which also allows incremental compilation.
This patch moves clang-interpreter as a test case and drops it as conditionally
built example as we already have clang-repl in place.
Differential revision: https://reviews.llvm.org/D107049
This patch implements Clang support for an original OpenMP extension
we have developed to support OpenACC: the `ompx_hold` map type
modifier. The next patch in this series, D106510, implements OpenMP
runtime support.
Consider the following example:
```
#pragma omp target data map(ompx_hold, tofrom: x) // holds onto mapping of x
{
foo(); // might have map(delete: x)
#pragma omp target map(present, alloc: x) // x is guaranteed to be present
printf("%d\n", x);
}
```
The `ompx_hold` map type modifier above specifies that the `target
data` directive holds onto the mapping for `x` throughout the
associated region regardless of any `target exit data` directives
executed during the call to `foo`. Thus, the presence assertion for
`x` at the enclosed `target` construct cannot fail. (As usual, the
standard OpenMP reference count for `x` must also reach zero before
the data is unmapped.)
Justification for inclusion in Clang and LLVM's OpenMP runtime:
* The `ompx_hold` modifier supports OpenACC functionality (structured
reference count) that cannot be achieved in standard OpenMP, as of
5.1.
* The runtime implementation for `ompx_hold` (next patch) will thus be
used by Flang's OpenACC support.
* The Clang implementation for `ompx_hold` (this patch) as well as the
runtime implementation are required for the Clang OpenACC support
being developed as part of the ECP Clacc project, which translates
OpenACC to OpenMP at the directive AST level. These patches are the
first step in upstreaming OpenACC functionality from Clacc.
* The Clang implementation for `ompx_hold` is also used by the tests
in the runtime implementation. That syntactic support makes the
tests more readable than low-level runtime calls can. Moreover,
upstream Flang and Clang do not yet support OpenACC syntax
sufficiently for writing the tests.
* More generally, the Clang implementation enables a clean separation
of concerns between OpenACC and OpenMP development in LLVM. That
is, LLVM's OpenMP developers can discuss, modify, and debug LLVM's
extended OpenMP implementation and test suite without directly
considering OpenACC's language and execution model, which can be
handled by LLVM's OpenACC developers.
* OpenMP users might find the `ompx_hold` modifier useful, as in the
above example.
See new documentation introduced by this patch in `openmp/docs` for
more detail on the functionality of this extension and its
relationship with OpenACC. For example, it explains how the runtime
must support two reference counts, as specified by OpenACC.
Clang recognizes `ompx_hold` unless `-fno-openmp-extensions`, a new
command-line option introduced by this patch, is specified.
Reviewed By: ABataev, jdoerfert, protze.joachim, grokos
Differential Revision: https://reviews.llvm.org/D106509
This change defines a helper function getOpenCLCompatibleVersion()
inside LangOptions class. The function contains mapping between
C++ for OpenCL versions and their corresponding compatible OpenCL
versions. This mapping function should be updated each time a new
C++ for OpenCL language version is introduced. The helper function
is expected to simplify conditions on OpenCL C and C++ for OpenCL
versions inside compiler code.
Code refactoring performed.
Differential Revision: https://reviews.llvm.org/D108693
Streamline template arguments across types, variables, and functions -
for convenient reuse in experiments related to template argument list
reconstitution (not including template argument lists in the "name" of
those entities, and leaving it to debug info consumers to rebuild the
full template name from the semantic descriptions of the argument lists)
But the change seems like a good refactoring/cleanup anyway.
I'd certainly be open to suggestions about how this might be more
streamlined - like is there no generic way to query template argument
lists across the 3 kinds of entities, rather than needing special case
code?
When deserializing a RecordDecl we don't enforce that redeclaration
chain contains only a single definition. So if the canonical decl is not
a definition itself, `RecordType::getDecl` can return different objects
before and after an include. It means we can build CGRecordLayout for
one RecordDecl with its set of FieldDecl but try to use it with
FieldDecl belonging to a different RecordDecl. With assertions enabled
it results in
> Assertion failed: (FieldInfo.count(FD) && "Invalid field for record!"),
> function getLLVMFieldNo, file llvm-project/clang/lib/CodeGen/CGRecordLayout.h, line 199.
and with assertions disabled a bunch of fields are treated as their
memory is located at offset 0.
Fix by keeping the first encountered RecordDecl definition and marking
the subsequent ones as non-definitions. Also need to merge FieldDecl
properly, so that `getPrimaryMergedDecl` works correctly and during name
lookup we don't treat fields from same-name RecordDecl as ambiguous.
rdar://80184238
Differential Revision: https://reviews.llvm.org/D106994
Empty packs in the non-final position would result in an extra ", ".
Empty packs in the final position would result in missing the space
between trailing >>.
This patch renames the vector clear left/right builtins vec_clrl, vec_clrr to
vec_clr_first and vec_clr_last to avoid the ambiguities when dealing with endianness.
Reviewed By: amyk, lei
Differential revision: https://reviews.llvm.org/D108702
Clang only adds GCC paths for RHEL <= 7 'devtoolset-<N>' Software
Collections (SCL). This generalizes this support to also include the
'gcc-toolset-10' SCL in RHEL/CentOS 8.
Reviewed By: stephan.dollberg
Differential Revision: https://reviews.llvm.org/D108908
The "aligned" attribute can only increase the alignment of a struct, or struct member, unless it's used together with the "packed" attribute, or used as a part of a typedef, in which case, the "aligned" attribute can both increase and decrease alignment.
That said, we expect:
1. "aligned" attribute alone: does not interfere with the alignment upgrade instrumented by the AIX "power" alignment rule,
2. "aligned" attribute + typedef: overrides any computed alignment,
3. "aligned" attribute + "packed" attribute: overrides any computed alignment.
The old implementation achieved 2 and 3, but didn't get 1 right, in that any field marked attribute "aligned" would not go through the alignment upgrade.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D107394
Add backward compatibility tests for mapping the deprecated
ConstructorInitializerAllOnOneLineOrOnePerLine and
AllowAllConstructorInitializersOnNextLine to
PackConstructorInitializers.
Differential Revision: https://reviews.llvm.org/D108882
Extend the information preserved in `TypeInfo` by replacing the `AlignIsRequired` bool flag with a three-valued enum, the enum also indicates where the alignment attribute come from, which could be helpful in determining whether the attribute should overrule.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D108858
Reading the AST block can never fail with a recoverable error as modules
cannot be removed during this phase. Change the return type of these
functions to return an llvm::Error instead, ie. either success or
failure.
NFC other than the wording of some of the errors.
Differential Revision: https://reviews.llvm.org/D108268
The default value of the now deprecated
AllowAllConstructorInitializersOnNextLine was always true
regardless of the value of BasedOnStyle. This patch fixes the typo
in the code that handles the related backward compatibility.
LLVM 13.0.0-rc2 shows change of behaviour in enum and interface BraceWrapping (likely before we simply didn't wrap) but may be related to {D99840}
Logged as https://bugs.llvm.org/show_bug.cgi?id=51640
This change ensure AfterEnum works for
`internal|public|protected|private enum A {` in the same way as it works for `enum A {` in C++
A similar issue was also observed with `interface` in C#
Reviewed By: krasimir, owenpan
Differential Revision: https://reviews.llvm.org/D108810
This seem to be a regression caused by this change:
https://reviews.llvm.org/D60943.
Since we delayed report the error, we would run into some invalid
state in clang and llvm.
Without this fix, clang would assert when passing function into
inline asm's input operand.
Differential Revision: https://reviews.llvm.org/D107941
Add a new option PackConstructorInitializers and deprecate the
related options ConstructorInitializerAllOnOneLineOrOnePerLine and
AllowAllConstructorInitializersOnNextLine. Below is the mapping:
PackConstructorInitializers ConstructorInitializer... AllowAll...
Never - -
BinPack false -
CurrentLine true false
NextLine true true
The option value Never fixes PR50549 by always placing each
constructor initializer on its own line.
Differential Revision: https://reviews.llvm.org/D108752
MallocOverflow works in two phases:
1) Collects suspicious malloc calls, whose argument is a multiplication
2) Filters the aggregated list of suspicious malloc calls by iterating
over the BasicBlocks of the CFG looking for comparison binary
operators over the variable constituting in any suspicious malloc.
Consequently, it suppressed true-positive cases when the comparison
check was after the malloc call.
In this patch the checker will consider the relative position of the
relation check to the malloc call.
E.g.:
```lang=C++
void *check_after_malloc(int n, int x) {
int *p = NULL;
if (x == 42)
p = malloc(n * sizeof(int)); // Previously **no** warning, now it
// warns about this.
// The check is after the allocation!
if (n > 10) {
// Do something conditionally.
}
return p;
}
```
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D107804
Lets wavefront size be 32 for amdgpu openmp, as well as 64.
Fixes up as little as possible to pass that through the libraries. This change
is end to end, as opposed to updating clang/devicertl/plugin separately. It can
be broken up for review/commit if preferred. Posting as-is so that others with
a gfx10 can try it out. It works roughly as well as gfx9 for me, but there are
probably bugs remaining as well as the todo: for letting grid values vary more.
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D108708
Fixes PR51626.
The M68k requires that all instruction, word and long word reads are
aligned to word boundaries. From the 68020 onwards, there is a
performance benefit from aligning long words to long word boundaries.
The M68k uses the same data layout for pointers and integers.
In line with this, this commit updates the pointer data layout to
match the layout already set for 32-bit integers: 32:16:32.
Differential Revision: https://reviews.llvm.org/D108792
Not only global variables can hold references to dead stack variables.
Consider this example:
void write_stack_address_to(char **q) {
char local;
*q = &local;
}
void test_stack() {
char *p;
write_stack_address_to(&p);
}
The address of 'local' is assigned to 'p', which becomes a dangling
pointer after 'write_stack_address_to()' returns.
The StackAddrEscapeChecker was looking for bindings in the store which
referred to variables of the popped stack frame, but it only considered
global variables in this regard. This patch relaxes this, catching
stack variable bindings as well.
---
This patch also works for temporary objects like:
struct Bar {
const int &ref;
explicit Bar(int y) : ref(y) {
// Okay.
} // End of the constructor call, `ref` is dangling now. Warning!
};
void test() {
Bar{33}; // Temporary object, so the corresponding memregion is
// *not* a VarRegion.
}
---
The return value optimization aka. copy-elision might kick in but that
is modeled by passing an imaginary CXXThisRegion which refers to the
parent stack frame which is supposed to be the 'return slot'.
Objects residing in the 'return slot' outlive the scope of the inner
call, thus we should expect no warning about them - except if we
explicitly disable copy-elision.
Reviewed By: NoQ, martong
Differential Revision: https://reviews.llvm.org/D107078
Clang currently picks the second tentative definition when
VarDecl::getActingDefinition is called.
This can lead to attributes being dropped if they are attached to
tentative definitions that appear after the second one. This is
because VarDecl::getActingDefinition loops through VarDecl::redecls
assuming that the last tentative definition is the last element in the
iterator. However, it is the second element that would be the last
tentative definition.
This changeset modifies getActingDefinition to iterate through the
declaration chain in reverse, so that it can immediately return when
it encounters a tentative definition.
Originally the unit test for this changeset did not have a -triple
flag for the clang invocation, leading to this test being broken on
MacOS, since Mach-O does not support the section attribute.
Differential Revision: https://reviews.llvm.org/D99732