This patch adds a CL option for avoiding the attribute compatibility
check between caller and callee in TTI. TTI attribute compatibility
checks for target CPU and target features.
In our downstream compiler, this attribute always remains the same
between callee and caller. By avoiding the addition of this attribute to
each of our inline candidate (and then checking them here during inline
cost), we save some compile time.
The option is kept false, so this change is an NFC upstream.
Outside users of the Attributor, e.g., OpenMP-opt, want to seed AAs
themselves. We should not seed all default AAs one an internal function
becomes live. That said, there should be a callback such that they can
do lazy seeding as well.
Differential Revision: https://reviews.llvm.org/D121489
There was some ad-hoc handling of liveness and manifest to avoid
breaking CGSCC guarantees. Things always slipped through though.
This cleanup will:
1) Prevent us from manifesting any "information" outside the CGSCC.
This might be too conservative but we need to opt-in to annotation
not try to avoid some problematic ones.
2) Avoid running any liveness analysis outside the CGSCC. We did have
some AAIsDeadFunction handling to this end but we need this for all
AAIsDead classes. The reason is that AAIsDead information is only
correct if we actually manifest it, since we don't (see point 1) we
cannot actually derive/use it at all. We are currently trying to
avoid running any AA updates outside the CGSCC but that seems to
impact things quite a bit.
3) Assert, don't check, that our modifications (during cleanup) modifies
only CGSCC functions.
The opt-out from rL236267 (2015) is untested and seems no longer needed (or not
needed when rL236267 was committed): there is nothing special with uncompressed
alignment. This brings us in line with GCC which compresses .debug_frame .
Checked that -g -fno-asynchronous-unwind-tables + objcopy
--decompress-debug-sections output is identical to -g
-fno-asynchronous-unwind-tables -gz + objcopy --decompress-debug-sections
output.
`ReadMemoryFromFileCache` can be called at a high rate, and has fast execution.
Signposts for high rate & brief duration can have a negative impact on tracing;
emitting a high volume signposts can lead to blocking, affecting performance,
and total volume makes human review of the trace harder because of the noise.
Differential Revision: https://reviews.llvm.org/D121226
* It doesn't required by OpenCL/Intel Level Zero and can be set programmatically.
* Add GPU to spirv lowering in case when attribute is not present.
* Set higher benefit to WorkGroupSizeConversion pattern so it will always try to lower first from the attribute.
Differential Revision: https://reviews.llvm.org/D120399
Early adoption of new technologies or adjusting certain code generation/IR optimization thresholds
is often available through some cl::opt options (which have unstable surfaces).
Specifying such an option twice will lead to an error.
```
% clang -c a.c -mllvm -disable-binop-extract-shuffle -mllvm -disable-binop-extract-shuffle
clang (LLVM option parsing): for the --disable-binop-extract-shuffle option: may only occur zero or one times!
% clang -c a.c -mllvm -hwasan-instrument-reads=0 -mllvm -hwasan-instrument-reads=0
clang (LLVM option parsing): for the --hwasan-instrument-reads option: may only occur zero or one times!
% clang -c a.c -mllvm --scalar-evolution-max-arith-depth=32 -mllvm --scalar-evolution-max-arith-depth=16
clang (LLVM option parsing): for the --scalar-evolution-max-arith-depth option: may only occur zero or one times!
```
The option is specified twice, because there is sometimes a global setting and
a specific file or project may need to override (or duplicately specify) the
value.
The error is contrary to the common practice of getopt/getopt_long command line
utilities that let the last option win and the `getLastArg` behavior used by
Clang driver options. I have seen such errors for several times. I think the
error just makes users inconvenient, while providing very little value on
discouraging production usage of unstable surfaces (this goal is itself
controversial, because developers might not want to commit to a stable surface
too early, or there is just some subtle codegen toggle which is infeasible to
have a driver option). Therefore, I suggest we drop the diagnostic, at least
before the diagnostic gets sufficiently better support for the overridding needs.
Removing the error is a degraded error checking experience. I think this error
checking behavior, if desirable, should be enabled explicitly by tools. Users
preferring the behavior can figure out a way to do so.
Reviewed By: jhenderson, rnk
Differential Revision: https://reviews.llvm.org/D120455
This patch adds a getter for the process' system architecture. I went
with Process::GetSystemArchitecture to match
Platform::GetSystemArchitecture.
Differential revision: https://reviews.llvm.org/D121443
We already do this for RISCVISD::VFMV_S_F_VL and the vfmv.v.f
intrinsic.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D121429
64 bit mach-o files have sections that only have 32 bit file offsets. If dsymutil tries to produce an invalid mach-o file, then error out with a good error string.
Differential Revision: https://reviews.llvm.org/D121398
This patch adds couple of tests for allocatable
globals and missing lowering for them
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D121473
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
This patch adds tests and missing lowering
code to lower elemental function/subroutine calls
in array expression
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D121474
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Don't overwrite the host architecture (obtained from qHostInfo) with the
process info (obtained from qProcessInfo).
Differential revision: https://reviews.llvm.org/D121442
This is a part of optimized callback reverts. This is needed to export the callbacks from the rt-asan libraries.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D121464
We should not be using APIs here that try to fetch the attribute
from both the call attributes and the function attributes. Otherwise
we'll try to upgrade a non-existent sret attribute on the call using
the attribute on the function.
While we don't need to check the element type in this case, we
do need to make sure that the pointers have the same address space,
otherwise RAUW will assert.
https://reviews.llvm.org/D120568 broke builds that set
both `LLVM_BUILD_LLVM_DYLIB` and `LLVM_LINK_LLVM_DYLIB`. This patch
fixes that.
The build failure was caused by the fact that some LLVM libraries (which
are also LLVM components) were listed directly as link-time dependencies
instead of using `LINK_COMPONENTS` in CMake files. This lead to a linker
invocation like this (simplified version to demonstrate the problem):
```
ld lib/libLLVM.so lib/libLLVMAnalysis.a lib/libLLVMTarget.a
```
That's problematic and unnecessary because `libLLVM.so` incorporates
`libLLVMAnalysis` and `libLLVMTarget`. A correct invocation would look
like this (`LLVM_LINK_LLVM_DYLIB` _is not_ set):
```
ld lib/libLLVMAnalysis.a lib/libLLVMTarget.a
```
or this (`LLVM_LINK_LLVM_DYLIB` _is_ set):
```
ld lib/libLLVM.so
```
Differential Revision: https://reviews.llvm.org/D121461
In an attempt to remove the memory leak we introduced a double free.
The problem was that we allowed a plain copy of the state and it was
actually used. The use was useless, so it is gone now. The copy
constructor is gone as well. The move constructor ensures the Accesses
pointers are owned by a single state, I hope.
Reported by: https://lab.llvm.org/buildbot/#/builders/16/builds/25820
Add operations -, abs, ceil and floor to the index notation.
Add test cases.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D121388
This patch implements support for the +, -, *, / and % operators on sizeless SVE
types. Support for these operators on svbool_t is excluded.
Differential Revision: https://reviews.llvm.org/D120323
The constexpr source element type was enumerated if the GEP was
used as part of an instruction. However, things like global
initializers go through a different code path, and we need to
enumerate the type there as well.
Since D101045, allocas are no longer required to be part of the
default alloca address space. There may be allocas in multiple
different address spaces. However, the bitcode reader would
simply assume the default alloca address space, resulting in
either an error or incorrect IR.
Add an optional record for allocas which encodes the address
space.
We don't need preprocessor logic to exclude those declarations when compiling for
the Windows App Store, because that is handled by using_if_exists now.
Differential Revision: https://reviews.llvm.org/D108632
Instead of carrying around #ifdefs to determine whether those functions
are available on the platform, unconditionally use the using_if_exists
attribute to import it into namespace std only when available. That was
the purpose of this attribute from the start.
This change means that trying to use libc++ with an old SDK (or on an
old platform for platforms that ship system headers in /usr/include)
will require a recent Clang that supports the using_if_exists attribute.
When using an older Clang or GCC, the underlying platform has to support
a C11 standard library.
Differential Revision: https://reviews.llvm.org/D108203
As a fly-by fix, also move it closer to where it is needed, and add a
comment explaining the existence of this weird function.
Differential Revision: https://reviews.llvm.org/D121231
Per LangRef, swifterror alloca must be a pointer.
Not checking this may result in a verifier error after transforms
instead, so make sure it's discarded early.