Use the cpu subtype enum values from llvm::MachO in the ArchSpec MachO
table. As I'm already cluttering the history, restore the table's
formatting to its original glory.
Differential revision: https://reviews.llvm.org/D92601
This change should be fairly straight forward. If we've reached a call, check to see if we can tell the result is dereferenceable from information about the minimum object size returned by the call.
To control compile time impact, I'm only adding the call for base facts in the routine. getObjectSize can also do recursive reasoning, and we don't want that general capability here.
As a follow up patch (without separate review), I will plumb through the missing TLI parameter. That will have the effect of extending this to known libcalls - malloc, new, and the like - whereas currently this only covers calls with the explicit allocsize attribute.
Differential Revision: https://reviews.llvm.org/D90341
Since there is no ROCm Device Library support for
long double, demote them to double, and use the fp64
math functions.
Differential Revision: https://reviews.llvm.org/D92130
The initial step of the uniform-after-vectorization (lane-0 demanded only) analysis was very awkwardly written. It would revisit use list of each pointer operand of a widened load/store. As a result, it was in the worst case O(N^2) where N was the number of instructions in a loop, and had restricted operand Value types to reduce the size of use lists.
This patch replaces the original algorithm with one which is at most O(2N) in the number of instructions in the loop. (The key observation is that each use of a potentially interesting pointer is visited at most twice, once on first scan, once in the use list of *it's* operand. Only instructions within the loop have their uses scanned.)
In the process, we remove a restriction which required the operand of the uniform mem op to itself be an instruction. This allows detection of uniform mem ops involving global addresses.
Differential Revision: https://reviews.llvm.org/D92056
We found out that we have clients relying on the old signature of the
Symbolicator initializer. Make the signature compatible again and
provide a factory method to initialize the class correctly based on
whether you have a target or want the symbolicator to create one for
you.
Differential revision: D92601
Make sure we recognize cpu sub-type 2 as arm64. In reality it's arm64e,
but we don't have the triple for that. Without this patch, we fall back
to unknown-apple-macosx- for the default architecture, which breaks
things like running expressions without a target.
Differential revision: https://reviews.llvm.org/D92603
We were keeping the state of parsed equivalence sets in the class
DeclarationVisitor. A problem happened when analyzing the the specification
part of a declaration that contained an EQUIVALENCE statement followed by an
interface block. The same DeclarationVisitor object that was created for the
outer declaration was being used to analyze the specification part
of a procedure body in the interface block. When analyzing the specification
part of the procedure in the interface block, the names in the outer
declaration's EQUIVALENCE statement were erroneously compared with the names in
the arguments of the interface procedure. This resulted in a bogus error
message.
I fixed this by not checking equivalence sets when we're in an interface
block. I also added a test that will produce an error message without
this change.
Differential Revision: https://reviews.llvm.org/D92501
Some C++20 headers weren't added properly to all three of these
test files. Add them, and take the time to normalize the formatting
so that
diff <(grep '#include' foo.cpp) <(grep '#include' bar.cpp)
shows no diffs (except that `no_assert_include` deliberately
excludes `<cassert>`).
- Add macro guards to <{barrier,latch,semaphore}>.
- Add macro guards to <experimental/simd>.
- Remove an include of <cassert> from <semaphore>.
- Instead, include <cassert> in the semaphore tests.
Differential Revision: https://reviews.llvm.org/D92525
* Rename some tests to try to make a convention (where all components
are optional) of:
<addrspace>_<syncscope>_<memory-orders>_<operation>
* Split up at a level of granularity appropriate for the different RUN
lines (i.e. split on addrspace so GFX6 can avoid FLAT) and that makes
running a specific test reasonable in terms of wall time taken. This
also means when run as part of the test suite the testing is not one
serial bottleneck.
* Auto-generate check lines with `update_llc_test_checks.py` to make
future maintenance more tractable.
Reviewed By: rampitec, t-tye
Differential Revision: https://reviews.llvm.org/D91545
Rather than having a different opcode for RV32 and RV64. Let's just say the integer type is XLenVT and use a single opcode for both modes.
Differential Revision: https://reviews.llvm.org/D92538
Use a fast path for column width computation for ascii characters. Especially
relevant for llvm-objdump.
before:
% time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null
./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.75s user 0.01s system 99% cpu 0.757 total
after:
% time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null
./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.37s user 0.01s system 99% cpu 0.378 total
Differential Revision: https://reviews.llvm.org/D92180
On AArch64 it allows use the native FP16 ABI (although libcalls are
not emitted for fptrunc/fpext lowering), while on other architectures
the expected current semantic is preserved (arm for instance).
For testing the _Float16 usage is enabled by architecture base,
currently only for arm, aarch64, and arm64.
This re-enabled revert done by https://reviews.llvm.org/rGb534beabeed3ba1777cd0ff9ce552d077e496726
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D92241
The conditional guarding createInitMemoryFunction was incorrect and
didn't match that guarding the creation of the associated symbol.
Rather that reproduce the same conditions in multiple places we can
simply use the presence of the associated symbol.
Also, add an assertion that would have caught this bug.
Also, add a new test for this flag combination.
This is part of an ongoing effort to enable dynamic linking with
threads in emscripten.
See https://github.com/emscripten-core/emscripten/issues/3494
Differential Revision: https://reviews.llvm.org/D92520
There is a library layering issue. LLVMAnalysis provides llvm/Analysis/ScopedNoAliasAA.h and depends on LLVMCore.
LLVMCore provides llvm/IR/Metadata.cpp and it should not include a header file in LLVMAnalysis
This fixes the build on gcc5 toolchain where sizeof is required for
types used in SmallVector now.
This is a consequence of using std::is_trivially_copy_constructible
instead of the LLVM variant: https://reviews.llvm.org/D92543
This reduces the chances of segfault. While it is a good practice to ensure
robust custom printers, it is unfortunately common to have them crash on
invalid input.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D92536
Right now, the regex expression will fail if the flags were not set. Instead, we should follow the pattern of other llvm projects and quote the expression, so that it can work even when the flags are not set.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D92586
LLVM passes overaligned objects by value, which MSVC 19.1 didn't support on
x86_32. MSVC added this support somewhere between 19.1 and 19.14, but godbolt
doesn't have 19.11, 19.12, or 19.13 so I can't test before 19.14:
https://gcc.godbolt.org/z/75YoEz
Even if users are using the Visual Studio 2017 series of Visual C++ toolchains,
they should've already updated to 19.14 or newer at this point, or they
wouldn't be able to build LLVM. This just raises the CMake required minimum
version so the build fails earlier.
Differential Revision: https://reviews.llvm.org/D92515
Internally the pass skips any function with the optnone attribute. But that still requires checking each function. If the opt level is set to None we might as well just skip putting in the pipeline at all. This what is already done for many of the passes added by TargetPassConfig.
Differential Revision: https://reviews.llvm.org/D92511
When MemCpyOpt performs call slot optimization it will concatenate the `alias.scope` metadata between the function call and the memcpy. However, scoped AA relies on the domains in metadata to be maintained in a caller-callee relationship. Naive concatenation breaks this assumption leading to bad AA results.
The fix is to take the intersection of domains then union the scopes within those domains.
The original bug came from a case of rust bad codegen which uses this bad aliasing to perform additional memcpy optimizations. As show in the added test case `%src` got forwarded past its lifetime leading to a dereference of garbage data.
Testing
ninja check-llvm
Reviewed By: jeroen.dobbelaere
Differential Revision: https://reviews.llvm.org/D91576
This patch removes the variants of DecodeVPERMVMask and
DecodeVPERMV3Mask that take "const Constant *C" as they are not used
anymore.
They were introduced on Sep 8, 2015 in commit
e88038f235.
The last use of DecodeVPERMVMask(const Constant *C, ...) was removed
on Feb 7, 2016 in commit 73fc26b44a.
The last use of DecodeVPERMV3Mask(const Constant *C, ...) was removed
on May 28, 2018 in commit dcfcfdb0d1.
Differential Revision: https://reviews.llvm.org/D91926
This is needed for cross-compiling LLVM from Cygwin, but it had gotten
deleted in rG2724d9e12960cc1d93eeabbfc9aa1bffffa041cc
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D92336
Add unit tests for functions in LLVMFrontendOpenACC. As notice in D91470 these functions were not tested
as well as the ones for OpenMP (D91643). This patch add tests for the OpenACC part.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D91653
Refactor src/math/hypotf.cpp and test/src/math/hypotf_test.cpp and reuse them for hypot and hypot_test
Differential Revision: https://reviews.llvm.org/D91831
This also teaches MachO writers/readers about the MachO cpu subtype,
beyond the minimal subtype reader support present at the moment.
This also defines a preprocessor macro to allow users to distinguish
__arm64__ from __arm64e__.
arm64e defaults to an "apple-a12" CPU, which supports v8.3a, allowing
pointer-authentication codegen.
It also currently defaults to ios14 and macos11.
Differential Revision: https://reviews.llvm.org/D87095
When using accumulators in loops, they are passed around in PHI nodes of unprimed
accumulators, causing the generation of additional prime/unprime instructions.
This patch detects these cases and changes these PHI nodes to primed accumulator
PHI nodes. We also add IR and MIR test cases for several PHI node cases.
Differential Revision: https://reviews.llvm.org/D91391