Refactor src/math/hypotf.cpp and test/src/math/hypotf_test.cpp and reuse them for hypot and hypot_test
Differential Revision: https://reviews.llvm.org/D91831
This also teaches MachO writers/readers about the MachO cpu subtype,
beyond the minimal subtype reader support present at the moment.
This also defines a preprocessor macro to allow users to distinguish
__arm64__ from __arm64e__.
arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing
pointer-authentication codegen.
It also currently defaults to ios14 and macos11.
Differential Revision: https://reviews.llvm.org/D87095
When using accumulators in loops, they are passed around in PHI nodes of unprimed
accumulators, causing the generation of additional prime/unprime instructions.
This patch detects these cases and changes these PHI nodes to primed accumulator
PHI nodes. We also add IR and MIR test cases for several PHI node cases.
Differential Revision: https://reviews.llvm.org/D91391
Implement fetch_<op>/fetch_and_<op>/exchange/compare-and-exchange
instructions for BPF. Specially, the following gcc intrinsics
are implemented.
__sync_fetch_and_add (32, 64)
__sync_fetch_and_sub (32, 64)
__sync_fetch_and_and (32, 64)
__sync_fetch_and_or (32, 64)
__sync_fetch_and_xor (32, 64)
__sync_lock_test_and_set (32, 64)
__sync_val_compare_and_swap (32, 64)
For __sync_fetch_and_sub, internally, it is implemented as
a negation followed by __sync_fetch_and_add.
For __sync_lock_test_and_set, despite its name, it actually
does an atomic exchange and return the old content.
https://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Atomic-Builtins.html
For intrinsics like __sync_{add,sub}_and_fetch and
__sync_bool_compare_and_swap, the compiler is able to generate
codes using __sync_fetch_and_{add,sub} and __sync_val_compare_and_swap.
Similar to xadd, atomic xadd, xor and xxor (atomic_<op>)
instructions are added for atomic operations which do not
have return values. LLVM will check the return value for
__sync_fetch_and_{add,and,or,xor}.
If the return value is used, instructions atomic_fetch_<op>
will be used. Otherwise, atomic_<op> instructions will be used.
All new instructions only support 64bit and 32bit with alu32 mode.
old xadd instruction still supports 32bit without alu32 mode.
For encoding, please take a look at test atomics_2.ll.
Differential Revision: https://reviews.llvm.org/D72184
This can happen when, for example, merging results from an external
index that generates IncludeHeaders with full URI rather than just
literal include.
Differential Revision: https://reviews.llvm.org/D92494
1. Removed #include "...AliasAnalysis.h" in other headers and modules.
2. Cleaned up includes in AliasAnalysis.h.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D92489
shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes
RTTI objects to be emitted hidden. Update two tests that didn't
expect this to happen for the default triple.
Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for
arm64-apple-macos triples too.
Part of PR46644.
Differential Revision: https://reviews.llvm.org/D91904
Memrefs with affine_map in the results of normalizable operation were
not normalized by `--normalize-memrefs` option. This patch normalizes
them.
Differential Revision: https://reviews.llvm.org/D88719
This morally reverts D82777 -- turns out that ld64 crashes with many
response files, so we must stop passing them to it until the crash is
fixed.
Differential Revision: https://reviews.llvm.org/D92357
The problem was that `sym` became replaced in the call
to make<ObjFile> and referring to it afer that read memory that now
stored a different kind of symbol (a Defined instead of a LazySymbol).
Since this happens only once per archive, just copy the symbol to the
stack before make<ObjFile> and read the copy instead.
Originally reviewed at https://reviews.llvm.org/D92496
Move the two different definitions of FUNC_ALIGN out of the ELF
specific block. Add the missing CFI_END in
END_COMPILERRT_OUTLINE_FUNCTION, to go with the corresponding CFI_START
in DEFINE_COMPILERRT_OUTLINE_FUNCTION_UNMANGLED.
Differential Revision: https://reviews.llvm.org/D92549
We have a plan to add libcxx and libcxxabi for VE. In order to do so,
we need to compile cxx source code with bootstarapped header files.
This patch adds such expected path to make clang++ work, at least
not crash at the startup. Add regression test for that, also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D92386
New tes cases added. Change var names to avoid the following warning from update_test_checks.py:
WARNING: Change IR value name 'tmp5' to prevent possible conflict with scripted FileCheck name.
Reviewed By: ebrevnov
Differential Revision: https://reviews.llvm.org/D92566
Generate checks with update_test_checks.py in order to simplify upcoming updates.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D92561
[libomptarget][amdgpu] Address compiler warnings, drive by fixes
Initialize some variables, remove unused ones.
Changes the debug printing condition to align with the aomp test suite.
Differential Revision: https://reviews.llvm.org/D92559
This reverts commit 0c9c6ddf17.
We are seeing some failures with this patch locally. Not clear
if it's causing them or just triggering a problem in another
place. Reverting while investigating.
Extended promote buffers to stack pass to support dynamically shaped allocas.
The conversion is limited by the rank of the underlying tensor.
An option is added to the pass to adjust the given rank.
Differential Revision: https://reviews.llvm.org/D91969
Also:
* Fix header line in all status tables.
* Use C++20 instead of C++2a.
Reviewed By: ldionne, #libc, miscco
Differential Revision: https://reviews.llvm.org/D92306
This adds a test for the bug
https://bugs.llvm.org/show_bug.cgi?id=47591
Previously, selection dag has a bug which may incorrectly
assume no alias when crossing a lifetime boundary and this
may generate incorrect code as demonstrated in the above bug.
It looks the bug is fixed by https://reviews.llvm.org/D91833.
Basically, when comparing two potential memory access dag nodes,
a store and a lifetime.start,
with the same frame index.
Previously, it may be decided no alias. With the above fix,
these two will be considered aliasing which will prevent
incorrect code scheduling.
Differential Revision: https://reviews.llvm.org/D92451
This is a child diff of D92261.
After supporting field/index-level shadow, the existing shadow with type
i16 works for only primitive types.
Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D92459
PowerPC ISA support the input test for vector type v4f32 and v2f64.
Replace the software compare with hw test will improve the perf.
Reviewed By: ChenZheng
Differential Revision: https://reviews.llvm.org/D90914
Fix bogus diagnostics that would get confused and think a "no viable
fuctions" case was an "undeclared identifiers" case, resulting in an
incorrect diagnostic preceding the correct one. Use overload resolution
to determine which function we should select when we can find call
candidates from a dependent base class. Make the diagnostics for a call
that could call a function from a dependent base class more specific,
and use a different diagnostic message for the case where the call
target is instead declared later in the same class. Plus some minor
diagnostic wording improvements.