In a prior review I was asked to move the helper function canIgnoreSNaN()
out to FPEnv.h. This wasn't possible at the time because that function
needs the fast math flags, and including them includes lots of other stuff
that isn't needed.
This patch moves the fast math flags out into a new FMF.h file unchanged,
and moves the helper function out to FPEnv.h also unchanged. This ticket
only moves code around.
Differential Revision: https://reviews.llvm.org/D119752
Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this
value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power
of 2 we can replace the sub with an xor. This allows us to use XORI
instead of LI+SUB.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D119411
Note: the term "libgcc" refers to the all of `libgcc.a`, `libgcc_eh.a`,
and `libgcc_s.so`.
Enabling libunwind as a replacement for libgcc on Linux has proven to be
challenging since libgcc_s.so is a required dependency in the [Linux
standard base][5]. Some software is transitively dependent on libgcc
because glibc makes hardcoded calls to functions in libgcc_s. For example,
the function `__GI___backtrace` eventually makes its way to a [hardcoded
dlopen to libgcc_s' _Unwind_Backtrace][1]. Since libgcc_{eh.a,s.so} and
libunwind have the same ABI, but different implementations, the two
libraries end up [cross-talking, which ultimately results in a
segfault][2].
To solve this problem, libunwind needs to build a “libgcc”. That is, link
the necessary functions from compiler-rt and libunwind into an archive
and shared object that advertise themselves as `libgcc.a`, `libgcc_eh.a`,
and `libgcc_s.so`, so that glibc’s baked calls are diverted to the
correct objects in memory. Fortunately for us, compiler-rt and libunwind
use the same ABI as the libgcc family, so the problem is solvable at the
llvm-project configuration level: no program source needs to be edited.
Thus, the end result is for a user to configure their LLVM build with a
flag that indicates they want to archive compiler-rt/unwind as libgcc.
We achieve this by compiling libunwind with all the symbols necessary
for compiler-rt to emulate the libgcc family, and then generate symlinks
named for our "libgcc" that point to their corresponding libunwind
counterparts.
We alternatively considered patching glibc so that the source doesn't
directly refer to libgcc, but rather _defaults_ to libgcc, so that a
system preferring compiler-rt/libunwind can point to these libraries
at the config stage instead. Even if we modified the Linux standard
base, this alternative won't work because binaries that are built using
libgcc will still end up having crosstalk between the differing
implementations.
This problem has been solved in this manner for [FreeBSD][3], and this
CL has been tested against [Chrome OS][4].
[1]: https://github.com/bminor/glibc/blob/master/sysdeps/arm/backtrace.c#L68
[2]: https://bugs.chromium.org/p/chromium/issues/detail?id=1162190#c16
[3]: https://github.com/freebsd/freebsd-src/tree/main/lib/libgcc_s
[4]: https://chromium-review.googlesource.com/c/chromiumos/overlays/chromiumos-overlay/+/2945947
[5]: https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/libgcc-s.html
Differential Revision: https://reviews.llvm.org/D108416
The clang-analyzer plugins are not linked to a particular tool, so they
can only be compiled if plugins are broadly supported. We could opt
instead to decide whether to link them to specifically against clang or
with undefined symbols, depending on the value of LLVM_ENABLE_PLUGINS,
but we do not currently expect there to be a use case for that rather
niche configuration.
Differential Revision: https://reviews.llvm.org/D119591
The copy and pristine versions of Utility diverged and one didn't
include <algorithm>. As that's a rather large header, let's just open
code the comparisons.
Reviewed By:
Differential Revision: https://reviews.llvm.org/
Enabled HW_REG_HW_ID as an alias for HW_REG_HW_ID1. This is required for compatibility with existing code.
Differential Revision: https://reviews.llvm.org/D119939
This patch passes in the AMDPGU math libraries to the linker wrapper.
The wrapper already handles linking OpenMP bitcode libraries via the
`--target-library` option. This should be sufficient to link in math
libraries for the accompanying architecture.
Fixes#53526.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119841
This patch adds a new Darwin clang driver environment variable in the
spirit of RC_DEBUG_OPTIONS, called RC_DEBUG_PREFIX_MAP, which allows a
meta build tool to add one additional -fdebug-prefix-map entry without
the knowledge of the build system.
rdar://85224675
Differential Revision: https://reviews.llvm.org/D119850
The pad-slice swap pattern generates `scf.if` and `tensor.generate`
to guard against zero-sized slices if it cannot prove the slice is
always non-zero. This is safe but quite conservative. It can be
unnecessary for cases where we know by problem definition such cases
does not exist, even if with dynamic shaped ops or unknown tile/slice
sizes, e.g., convolution padding size = 1 with kernel dim size = 3.
So this commit introduces a control to the pattern to specify
whether to generate the if constructs to handle such cases better,
given that once the if constructs is materialized, it's very hard
to analyze and simplify.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D117017
It actually *is* important (for structured bindings) that `get(tuple)`
be ADL-able; but that's not the point of this test in particular.
Reviewed as part of D119860.
We shouldn't be calling `rethrow_exception` via ADL -- and neither should anybody
in the wild be calling it via ADL, so it's not like we need to test
this ADL ability of `rethrow_exception` in particular.
Reviewed as part of D119860.
We shouldn't be calling these functions via ADL -- and neither should anybody
in the wild be calling it via ADL, so it's not like we need to test
the ADL ability of these functions in particular.
Reviewed as part of D119860.
We shouldn't be calling `prev` via ADL -- and neither should anybody
in the wild be calling it via ADL, so it's not like we need to test
this ADL ability of `prev` in particular.
Reviewed as part of D119860.
We shouldn't be calling `next` via ADL -- and neither should anybody
in the wild be calling it via ADL, so it's not like we need to test
this ADL ability of `next` in particular.
Reviewed as part of D119860.
We shouldn't be calling `move` via ADL -- and neither should anybody
in the wild be calling it via ADL, so it's not like we need to test
this ADL ability of `move` in particular.
Reviewed as part of D119860.
This patch changes patchELFAllocatableRelaSections from going through
old relocations sections and update the relocation offsets to emitting
the relocations stored in binary sections. This is needed in case we
would like to remove and add dynamic relocations during BOLT work and it
is used by golang support pass. Note: Currently we emit relocations in
the old sections, so the total number of them should be equal or less
of old number.
Testing: No special tests are neeeded, since this patch does not fix
anything or add new functionality (it only prepares to add). Every
PIC-compiled test binary will use this code and thus become a test.
But just in case the aarch64 dynamic relocations tests were added.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D117612
By convention, memcpy/memmove intrinsics are always used with i8
pointers (though this is not enforced), so in practice this code
was always using an i8 type. Make that explicit.
Of course, i8 is not a very profitable choice, and this code could
be more performant by picking an appropriate larger type. But that
would require additional test coverage and correctness review, and
certainly shouldn't be a decision based on the pointer element type.
The named and generic address space overloads for atomic_init added
by 50f8abb9f4 ("[OpenCL] Add OpenCL 3.0 atomics to
-fdeclare-opencl-builtins", 2022-02-11) were not guarded by the
corresponding extensions.
Currently the fsub optimizations in InstSimplify don't know how to fold
X - -0.0 to X when we know X is not zero and the constrained intrinsics
are used. This adds the support.
This review is split out from D107285.
Differential Revision: https://reviews.llvm.org/D119746
Algorithm for hypotf: compute (a*a + b*b) in double precision, then use Dekker's algorithm to find the rounding error, and then correcting it after taking its square-root.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D118157
This relands 73e585e44d (and 0574b5fc65), with a fix for
the failing test (by using Optional<StringRef>s instead of
making StringRef::empty() mean absence of value).
Differential Revision: https://reviews.llvm.org/D118070
Address space casts in general may change the element type, but
don't allow it in the method working on Address, so we can
preserve the element type.
CreatePointerBitCastOrAddrSpaceCast() still needs to be addressed.