Autogenerated names are too long and break compilation on Windows,
while we do not need this enum at all.
Differential Revision: https://reviews.llvm.org/D119869
Reported by Stefan Pintilie in D119773.
For a branch to a hidden undefined weak symbol, there is an
`assert(sym->getVA());` failure in PPC64LongBranchTargetSection::writeTo for a
-no-pie link. The root cause is that we unnecessarily create the thunk for the
-no-pie link.
Fix this by changing the condition to just `s.isUndefined()`. See the inline
comment.
Rename ppc64-weak-undef-call.s to ppc64-undefined-weak.s to be consistent with
other architectures.
Reviewed By: sfertile, stefanp
Differential Revision: https://reviews.llvm.org/D119787
Also, fix the actual code so that the test would pass if we fixed the
issue that the method is instantiated in the dylib, and hence the debug
assertion will never fire except if the debug mode is enabled when the
dylib is being compiled.
This allow user to register a callback that can annotate operations
during software pipelining. This allows user potential annotate op to
know what part of the pipeline they correspond to.
Differential Revision: https://reviews.llvm.org/D119866
We can now configure the space between requires and the following paren,
seperate for clauses and expressions.
Differential Revision: https://reviews.llvm.org/D113369
Detect requires expressions in more unusable contexts. This is far from
perfect, but currently we have no good metric to decide between a
requires expression and a trailing requires clause.
Differential Revision: https://reviews.llvm.org/D119138
Without results, there is no getType injected and so generating one in prefixed form doesn't result in any failures during C++ compilation.
Differential Revision: https://reviews.llvm.org/D119871
constexpr var may be initialized with address of non-const variable.
In this case the initializer is not constant in device compilation.
This has been handled for const vars but not for constexpr vars.
This patch makes handling of const var and constexpr var
consistent.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D119615
Fixes: https://github.com/llvm/llvm-project/issues/53780
This patch adds a new target to the OpenMP CPU offloading tests. This
tests the usage of the new driver for CPU offloading. If this all works
then we can move to transition to the new driver as the default.
Depends on D119613
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119736
This patch adds support for linking CPU offloading applications in the
linker wrapper. We generate the necessary linking job using the host
linker's path and library arguments. This may not be true for more
complex offloading schemes, but this is sufficient for now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119613
This test shows that when access patterns do not match (e.g. transposing
a row-wise sparse matrix into another row-wise sparse matrix), a conversion
operation in between can enable codegen (i.e. avoid cycle in iteration graph).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D119864
Layering-wise, it seems RegisterBank stuff fits under CodeGen, like
other target abstraction.
In particular, TargetSubtargetInfo has a getRegBankInfo member, but
using that object requires making sure GlobalISel is linked, which is
not always the case (e.g. llvm-jitlink doesn't).
Differential Revision: https://reviews.llvm.org/D119053
Track source location information when available for actual arguments
to procedure references, and use this information when checking constraints
on calls so that error messages refer to specific actual arguments
rather than to the entire call.
Differential Revision: https://reviews.llvm.org/D119849
Optional parameters with `defaultValue` set will be populated with that value if they aren't encountered during parsing. Moreover, parameters equal to their default values are elided when printing.
Depends on D118210
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118544
When compiling with Clang modules enabled, polly's use of using-directives
caused the global object `Target` in RegisterPasses.cpp to clash with
`llvm::Target`. By eliminating the using-directives, we're able to get
polly to play nicely with a modules build.
Differential Revision: https://reviews.llvm.org/D119809
Calls to C_F_POINTER() without the optional SHAPE= third argument
were failing to be recognized as proper calls to the intrinsic,
but the failure was not generating any error message. This led to
a crash in lowering, which rightfully expects a typed expression
to be associated with the call.
So (1) catch silent failures to convert CALL statements as internal
errors, as is done for expressions and assignment statements; and
(2) clean up C_F_POINTER intrinsic handling to cope with only two
arguments and to emit an error for a FPTR= argument with no type.
Differential Revision: https://reviews.llvm.org/D119847
This change is needed when lowering alloca()-using code on targets
such as ROCDL that represent private scratch space as a separate
address space.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119775
The `__builtin_pdepd` and `__builtin_pextd` are P10 builtins that are meant to
be used under 64-bit only. For instance, when the builtins are compiled under
32-bit mode:
```
$ cat t.c
unsigned long long foo(unsigned long long a, unsigned long long b) {
return __builtin_pextd(a,b);
}
$ clang -c t.c -mcpu=pwr10 -m32
ExpandIntegerResult #0: t31: i64 = llvm.ppc.pextd TargetConstant:i32<6928>, t28, t29
fatal error: error in backend: Do not know how to expand the result of this operator!
```
This patch adds sema checking for these builtins to compile under 64-bit
mode only and on P10. The builtins will emit a diagnostic when they are compiled on
non-P10 compilations and on 32-bit mode.
Differential Revision: https://reviews.llvm.org/D118753
EQUIVALENCE storage association of objects whose types are not
both default-kind numeric storage sequences, or not both default-kind
character storage sequences, are not standard conformant.
However, most Fortran compilers admit such usage, with warnings
in strict conformance mode. This patch allos EQUIVALENCE of objects
that have sequence types that are either identical, both numeric
sequences (of default kind or not), or both character sequences.
Non-sequence types, and sequences types that are not homogeneously
numeric or character, remain errors.
Differential Revision: https://reviews.llvm.org/D119848
Our best guess is that the two syntaxes should have exactly equivalent
effects, so, let's be consistent with what we do in libcxx/include/.
I've left `#include "include/x.h"` and `#include "../y.h"` alone
because I'm less sure that they're interchangeable, and they aren't
inconsistent with libcxx/include/ because libcxx/include/ never
does that kind of thing.
Also, use the `_LIBCPP_PUSH_MACROS/POP_MACROS` dance for `<__undef_macros>`,
even though it's technically unnecessary in a standalone .cpp file,
just so we have consistently one way to do it.
Differential Revision: https://reviews.llvm.org/D119561
`Shutdown.h` was transitively depending on two headers, but this isn't
allowed under a modules build, so they're now explicitly included.
Differential Revision: https://reviews.llvm.org/D119806
When a pointer assignment with bounds remapping has a function
reference as its right-hand side, don't check for array conformance.
Differential Revision: https://reviews.llvm.org/D119845
https://maskray.me/blog/2022-01-16-archives-and-start-lib
For every definition in an extracted archive member, we intern the symbol twice,
once for the archive index entry, once for the .o symbol table after extraction.
This is inefficient.
Symbols in a --start-lib ObjFile/BitcodeFile are only interned once because the
result is cached in symbols[i].
Just handle an archive using the --start-lib code path. We can therefore remove
ArchiveFile and LazyArchive. For many projects, archive member extraction ratio
is high and it is a net performance win. Linking a Release build of clang is
1.01x as fast.
Note: --start-lib scans symbols in the same order that llvm-ar adds them to the
index, so in the common case the semantics should be identical. If the archive
symbol table was created in a different order, or is incomplete, this strategy
may have different semantics. Such cases are considered user error.
The `is neither ET_REL nor LLVM bitcode` error is changed to a warning.
Previously an archive may have such members without a diagnostic. Using a
warning prevents breakage.
* For some tests, the diagnostics get improved where we did not consider
the archive member name: `b.a:` => `b.a(b.o):`.
* `no-obj.s`: the link is now allowed, matching GNU ld
* `archive-no-index.s`: the `is neither ET_REL nor LLVM bitcode` diagnostic is
demoted to a warning.
* `incompatible.s`: even when an archive is unextracted, we may report an
"incompatible with" error.
---
I recently decreased sizeof(SymbolUnion) by 8 and decreased memory usage quite a
bit, so retaining `symbols` for un-extracted archive members should not cause a
memory usage problem.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D119074
Casting between scalable vectors and fixed-length vectors doesn't make
sense. If one of the operands is scalable, the other has to be scalable
to be able to guarantee they have the same shape at runtime.
Differential Revision: https://reviews.llvm.org/D119568
Minor comment updates and use getVoidPtr helper instead of
builiding `i8*` type manually in codegen.
Differential Revision: https://reviews.llvm.org/D119828
vp.select|merge both select lanes based on a condition mask. Unlike
other VP intrinsics the lanes are defined where the condition mask is
false. Hence, the condition mask in vp.select|mask is not a mask in the
sense of VP intrinsics. By doing not treating the condition mask
specially, vp.select becomes the canonical VP translation of the select
instruction.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D118981
Simplify the logic when the exponent difference is at least MantissaLength + 2, while still maintaining correct rounding for all rounding modes.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D119843