This patch fixes various issues with our prior `declare target` handling
and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to
avoid the "ref" global use introduced for globals with internal linkage.
From there it went down the rabbit hole, e.g., all variables, even
`nohost` ones, were emitted into the device code so it was impossible to
determine if "ref" was needed late in the game (based on the name only).
To make it really useful, `begin declare target` was needed as it can
carry the `device_type`. Not emitting variables eagerly had a ripple
effect. Finally, the precedence of the (explicit) declare target list
items needed to be taken into account, that meant we cannot just look
for any declare target attribute to make a decision. This caused the
handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse
declarations and defintions" as part of OpenMP parsing, this will always
break at some point. Instead, we keep track what region we are in and
act on definitions and declarations instead, this is what we do for
declare variant and other begin/end directives already.
Highlights:
- new diagnosis for restrictions specificed in the standard,
- delayed emission of globals not mentioned in an explicit
list of a declare target,
- omission of `nohost` globals on the host and `host` globals on the
device,
- no explicit parsing of declarations in-between `omp [begin] declare
variant` and the corresponding end anymore, regular parsing instead,
- precedence for explicit mentions in `declare target` lists over
implicit mentions in the declaration-definition-seq, and
- `omp allocate` declarations will now replace an earlier emitted
global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do
on their own lead to "inconsistent states", which seem less desirable
overall.
After working through this I feel the standard should remove the
explicit declare target forms as the delayed emission is horrible.
That said, while we delay things anyway, it seems to me we check too
often for the current status even though that is often not sufficient to
act upon. There seems to be a lot of duplication that can probably be
trimmed down. Eagerly emitting some things seems pretty weak as an
argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
This patch adds support for two new variants of the vectorize_width
pragma:
1. vectorize_width(X[, fixed|scalable]) where an optional second
parameter is passed to the vectorize_width pragma, which indicates if
the user wishes to use fixed width or scalable vectorization. For
example the user can now write something like:
#pragma clang loop vectorize_width(4, fixed)
or
#pragma clang loop vectorize_width(4, scalable)
In the absence of a second parameter it is assumed the user wants
fixed width vectorization, in order to maintain compatibility with
existing code.
2. vectorize_width(fixed|scalable) where the width is left unspecified,
but the user hints what type of vectorization they prefer, either
fixed width or scalable.
I have implemented this by making use of the LLVM loop hint attribute:
llvm.loop.vectorize.scalable.enable
Tests were added to
clang/test/CodeGenCXX/pragma-loop.cpp
for both the 'fixed' and 'scalable' optional parameter.
See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html
Differential Revision: https://reviews.llvm.org/D89031
OpenMP 5.0 supports nested declare target regions. So, in general,it is
allow to mark a declarationas declare target with different device_type
or link type. Patch adds support for such kind of nesting.
Differential Revision: https://reviews.llvm.org/D86239
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D77112
See rational here: https://reviews.llvm.org/D76173#1922916
Time to compile Attr.h in isolation goes from 2.6s to 1.8s.
Original patch by Johannes, plus some additions from Reid to fix some
clang tooling targets.
Effect on transitive includes is marginal, though:
$ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \
| grep '^[-+] ' | sort | uniq -c | sort -nr
104 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/AST/OpenMPClause.h
87 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Frontend/OpenMP/OMPContext.h
19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SmallSet.h
19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SetVector.h
14 - /usr/include/c++/9/set
...
Differential Revision: https://reviews.llvm.org/D76184
This is a cleanup and normalization patch that also enables reuse with
Flang later on. A follow up will clean up and move the directive ->
clauses mapping.
Differential Revision: https://reviews.llvm.org/D77112
This has very little impact on build time, but is a mechanical pre-req
to removing the OpenMPClause.h include, which matters. Most of these
pretty print methods require Expr to be complete.
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
We know all subclasses in tblgen so just generate a giant switch for
the few virtual methods or turn them into a member variable using spare
bits. The giant jump tables aren't pretty but still much smaller than
a vtable for every attribute, shrinking Release+Asserts clang by ~400k.
Also halves the size of the Attr base class. No functional change
intended.
llvm-svn: 232726
Required making a handful of changes to the table generator. Also adds
an unspecified inheritance attribute. This opens the path for us to
apply these attributes to C++ records implicitly.
llvm-svn: 178054
uncovered.
This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.
I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.
llvm-svn: 169237
Now all classes derived from Attr are generated from TableGen.
Additionally, Attr* is no longer its own linked list; SmallVectors or
Attr* are used. The accompanying LLVM commit contains the updates to
TableGen necessary for this.
Some other notes about newly-generated attribute classes:
- The constructor arguments are a SourceLocation and a Context&,
followed by the attributes arguments in the order that they were
defined in Attr.td
- Every argument in Attr.td has an appropriate accessor named getFoo,
and there are sometimes a few extra ones (such as to get the length
of a variadic argument).
Additionally, specific_attr_iterator has been introduced, which will
iterate over an AttrVec, but only over attributes of a certain type. It
can be accessed through either Decl::specific_attr_begin/end or
the global functions of the same name.
llvm-svn: 111455
"editing" mode, introduce a separate function
clang_defaultEditingTranslationUnitOptions() that retrieves the set of
options. No functionality change.
llvm-svn: 110613
Currently, there are two effective changes:
- Attr::Kind has been changed to attr::Kind, in a separate namespace
rather than the Attr class. This is because the enumerator needs to
be visible to parse.
- The class definitions for the C++0x attributes other than aligned are
generated by TableGen.
The specific classes generated by TableGen are controlled by an array in
TableGen (see the accompanying commit to the LLVM repository). I will be
expanding the amount of code generated as I develop the new attributes system
while initially keeping it confined to these attributes.
llvm-svn: 106172
array associated with NonNullAttr. This fixes yet another leak when
ASTContext uses a BumpPtrAllocator.
Fixes: <rdar://problem/7637150>
llvm-svn: 95863
array allocated using the allocator in ASTContext. This addresses
these strings getting leaked when using a BumpPtrAllocator (in
ASTContext).
Fixes: <rdar://problem/7636765>
llvm-svn: 95853
attribute, so it uses Anton's new target-specific attribute support. It's
supposed to ensure that the stack is 16-byte aligned, but since necessary
support is lacking from LLVM, this is a no-op for now.
llvm-svn: 95820