OpenMP Spec 5.0 [2.12.5, Restrictions]: If a device clause in which the
ancestor device-modifier appears is present on the target construct,
then a requires directive with the reverse_offload clause must be
specified.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D118887
Remove the vector base class as suggested by @ldionne
Reviewed By: ldionne, Quuxplusone, #libc
Spies: libcxx-commits, ldionne
Differential Revision: https://reviews.llvm.org/D117108
With this patch pre-conditions can be added to a list of constraints.
Constraints with pre-conditions can only be used if all pre-conditions
are satisfied when the constraint is used.
The pre-conditions at the moment are specified as a list of
(Predicate, Value *,Value *) tuples. This allow easily checking them
like any other condition, using the existing infrastructure.
This then is used to limit GEP decomposition to cases where we can
prove that offsets are signed positive.
This fixes a couple of incorrect transforms where GEP offsets where
assumed to be signed positive, but they were not.
Note that this effectively disables GEP decomposition, as there's no
support for reasoning about signed predicates. D118806 adds initial
signed support.
Fixes PR49624.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D118799
The second space in `void foo() &` is always produced by clang-format,
and isn't evidence of any particular style.
Before this patch, it was considered evidence of PAS_Right, because
there is a space before a pointerlike ampersand.
This caused the following code to have "unstable" pointer alignment:
void a() &;
void b() &;
int *x;
PAS_Left, Derive=false would produce 'int* x' with other lines unchanged.
But subsequent formatting with Derive=true would produce 'int *x' again.
Differential Revision: https://reviews.llvm.org/D118921
Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to an assertion failures with different types.
This patch makes sure we use memory representation for `va_arg`.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D118904
This header is very large (3M Lines once expended) and was included in location
where dwarf-specific information were not needed.
More specifically, this commit suppresses the dependencies on
llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and
llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used,
this has a decent impact on number of preprocessed lines generated during
compilation of LLVM, as showcased below.
This is achieved by moving some definitions back to the .cpp file, no
performance impact implied[0].
As a consequence of that patch, downstream user may need to manually some extra
files:
llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h
llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h
In some situations, codes maybe relying on the fact that
llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden
dependency now needs to be explicit.
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 10978519
before: 11245451
Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
[0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions
Differential Revision: https://reviews.llvm.org/D118781
This patch adds partial lowering of the "GET_COMMAND_ARGUMENT"
intrinsic to the backend runtime hook implemented in patches D109227,
D109813, D109814.
Differential Revision: https://reviews.llvm.org/D118801
This patch handles the quiet argument in the STOP statement. It adds
ability to lower LOGICAL constant.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D118897
Co-authored-by: Jean Perier <jperier@nvidia.com>
This makes the statepoint methods in IRBuilder accept a
FunctionCallee, which carries both the callee and function type.
This is used to add the elementtype attribute to the statepoint call.
RS4GC requires an additional tweak to actually preserve that attribute
-- previously the attributes on the call were completely overwritten.
Differential Revision: https://reviews.llvm.org/D118886
In the aftermath of D116895 a problem was found in the analysis of
dependencies between store merge candidates in
checkMergeStoreCandidatesForDependencies, that is needed to avoid
the cycles are introduced in the DAG.
In the past it has been enough (or assumed to be enough) to start
scanning from non-chain operands when analysing the store merge
candidates for dependencies, assuming that the analysis of chain
dependencies performed when finding the candidates would cover
up for potential dependencies that exist involving the chain operands.
It was however discovered that one could end up with scenarios such
as descibed in the aarch64-checkMergeStoreCandidatesForDependencies.ll
test case, when the dependency between two stores is given by a mix
of chain operand dependencies and non-chain operand dependencies.
The fix in this patch make sure that we also account for chain operand
dependencies when doing the more elaborate analysis in
checkMergeStoreCandidatesForDependencies, no longer relying on that
the earlier check involving chain operands is enough.
Differential Revision: https://reviews.llvm.org/D118943
The renames the output_iterator to cpp17_output_iterator. These
iterators are still used in C++20 so it's not possible to change the
current type to the new C++20 requirements. This is done in a similar
fashion as the cpp17_input_iterator.
Reviewed By: #libc, Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D117950
Simple pass that changes all symbols to private unless symbol is excluded (and
in which case there is no change to symbol's visibility).
Differential Revision: https://reviews.llvm.org/D118752
This adds the assertion that all items in the ready list are in-fact scheduleable entities ready to be scheduled. This involves changing the ReadyInsts structure to be a set, and fixing a couple places where we left nodes on the list when they were no longer ready.
The tools used by test-suite are originally configured to compile with cc by
default, and this is dictated by TEST_SUITE_HOST_CC.
However, it is possible that on some systems that the version of cc may either
not be present or it may not be able to compile the tools as it may be too old,
which could be an issue seen during release testing.
This patch updates the compiler to be the default build compiler that is used
for release testing. If no such compiler it specified, then cc will be set as
the test-suite tools build compiler by default (as it already is set under
TEST_SUITE_HOST_CC).
Differential Revision: https://reviews.llvm.org/D118357
Finding re-materialization chain for derived pointer does not depend on
call site. To avoid this finding for each call site it can be extracted in
a separate routine.
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D118676
The linker wrapper tool uses the 'nvlink' and 'ptxas' binaries to link
and assemble device files. Previously we searched for this using the
binaries in the user's path. This didn't work in cases where the user
passed in a specific Cuda path to Clang. This patch changes the linker
wrapper to accept an argument for the Cuda path we can get from Clang.
This should fix#53573.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D118944
regcomp.c uses the "start + count < end" idiom to check that there are
"count" bytes available in an array of char "start" and "end" both point
to.
This is fine, unless "start + count" goes beyond the last element of the
array. In this case, pedantic interpretation of the C standard makes
the comparison of such a pointer against "end" undefined, and optimizers
from hell will happily remove as much code as possible because of this.
An example of this occurs in regcomp.c's bothcases(), which defines
bracket[3], sets "next" to "bracket" and "end" to "bracket + 2". Then it
invokes p_bracket(), which starts with "if (p->next + 5 < p->end)"...
Because bothcases() and p_bracket() are static functions in regcomp.c,
there is a real risk of miscompilation if aggressive inlining happens.
The following diff rewrites the "start + count < end" constructs into
"end - start > count". Assuming "end" and "start" are always pointing in
the array (such as "bracket[3]" above), "end - start" is well-defined
and can be compared without trouble.
As a bonus, MORE2() implies MORE() therefore SEETWO() can be simplified
a bit.
Bug report: https://github.com/llvm/llvm-project/issues/47993
Reviewed By: MaskRay, vitalybuka
Differential Revision: https://reviews.llvm.org/D97129
Earlier in LLD's evolution, I tried to create the illusion that
subsections were indistinguishable from "top-level" sections. Thus, even
though the subsections shared many common field values, I hid those
common values away in a private Shared struct (see D105305). More
recently, however, @gkm added a public `Section` struct in D113241 that
served as an explicit way to store values that are common to an entire
set of subsections (aka InputSections). Now that we have another "common
value" struct, `Shared` has been rendered redundant. All its fields can
be moved into `Section` instead, and the pointer to `Shared` can be replaced
with a pointer to `Section`.
This `Section` pointer also has the advantage of letting us inspect other
subsections easily, simplifying the implementation of {D118798}.
P.S. I do think that having both `Section` and `InputSection` makes for
a slightly confusing naming scheme. I considered renaming `InputSection`
to `Subsection`, but that would break the symmetry with `OutputSection`.
It would also make us deviate from LLD-ELF's naming scheme.
This change is perf-neutral on my 3.2 GHz 16-Core Intel Xeon W machine:
base diff difference (95% CI)
sys_time 1.258 ± 0.031 1.248 ± 0.023 [ -1.6% .. +0.1%]
user_time 3.659 ± 0.047 3.658 ± 0.041 [ -0.5% .. +0.4%]
wall_time 4.640 ± 0.085 4.625 ± 0.063 [ -1.0% .. +0.3%]
samples 49 61
There's also no stat sig change in RSS (as measured by `time -l`):
base diff difference (95% CI)
time 998038627.097 ± 13567305.958 1003327715.556 ± 15210451.236 [ -0.2% .. +1.2%]
samples 31 36
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D118797
Now that the minimum version version of MSVC required to build LLVM has
been bumped, we see
../../llvm/include\llvm/Support/Compiler.h(94,2): error: LLVM requires
at least VS 2019.
#error LLVM requires at least VS 2019.
e.g. http://45.33.8.238/win/53703/step_4.txt
1920 corresponds to the earliest version of VS 2019.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D118713
Remove the `StringRef` summary provider in favor of the implementation in
`llvm/utils/lldbDataFormatters.py`.
This implementation was resulting in errors in some cases.
Differential Revision: https://reviews.llvm.org/D116987
As per [time.duration.cons]/1, the constructor constraint should be on
const Rep2&. As it is now the code will fail to compile in certain
cases, for example (https://godbolt.org/z/c7fPrcTYM):
struct S{
operator int() const&& noexcept = delete;
operator int() const& noexcept;
};
const S &fun();
auto k = std::chrono::microseconds{fun()};
Differential Revision: https://reviews.llvm.org/D118902
This adds support for automatically cherry-picking and testing fixes for the
release branch using 'commands' in issue comments. The two supported commands are:
/cherry-pick <commit1> <commit2> ...
Which will backport and test commits from main. And also
/branch owner/repo/branch
Which will test commits from the given branch.
Reviewed By: alexbatashev, kwk
Differential Revision: https://reviews.llvm.org/D117386
This change extends the RawMemProfReader to read all the sections of the
raw profile and symbolize the virtual addresses recorded as part of the
callstack for each allocation. For now the symbolization is used to
display the contents of the profile with llvm-profdata.
Differential Revision: https://reviews.llvm.org/D116784
Print out the profile summary in YAML format to make it easier to for
tools and tests to read in the contents of the raw profile.
Differential Revision: https://reviews.llvm.org/D116783
This change templatizes the InstrProfIterator where the default
specialization is based on the current usage, i.e. the reader_type is
InstrProfReader and the record_type (value_type) is
NamedInstrProfRecord. A subsequent patch will use the same iterator
template to implement an iterator for the RawMemProfReader.
Differential Revision: https://reviews.llvm.org/D116782
This change moves the SymbolizableObjectFile header to
include/llvm/DebugInfo/Symbolize. Making this header available to other
llvm libraries simplifies use cases where implicit caching, multiple
platform support and other features of the Symbolizer class are not
required. This also makes the dependent libraries easier to unit test
by having mocks which derive from SymbolizableModule.
Differential Revision: https://reviews.llvm.org/D116781
Similar to the G_*MULO change.
The code for checking if a constant is legal/pre-legalize is shared between
these, and is kind of hairy. So, factor it out into a new function:
`isConstantLegalOrBeforeLegalizer`.
To make the refactoring clean, further refactor `isLegalOrBeforeLegalizer` into
a wrapper for two functions:
- `isPreLegalize`
- `isLegal`
This is a bit easier to read in general.
https://godbolt.org/z/KW7oszP1o
Differential Revision: https://reviews.llvm.org/D118655
Similar to the following combine in `DAGCombiner::visitMULO`:
```
// fold (mulo x, 0) -> 0 + no carry out
if (isNullOrNullSplat(N1))
return CombineTo(N, DAG.getConstant(0, DL, VT),
DAG.getConstant(0, DL, CarryVT));
```
This fixes some generally poor codegen for `*mulo`:
https://godbolt.org/z/eTxYsvz8f
Differential Revision: https://reviews.llvm.org/D118635
I see a lot of array sorting in stack traces of our compiler, canonicalizer traverses this list every time it builds a pattern set, and it gets expensive very quickly.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D118937