The x86-64 backend currently has a bug which uses a wrong register when for the GOTPCREL reference.
The program will crash without the dso_local specifier.
On the surface this would be slightly less optimal for the isel
table, but due to a tablegen issue with HW mode this ends up
generating a smaller isel table.
AddressSanitizer instrumentation does not set dso_local on non-thread-local
global variables in -fno-pic and it seems to rely on implied dso_local to work.
Add a hack until we have fixed AddressSanitizer to call setDSOLocal() as
appropriate.
Thanks to Vitaly Buka for reporting the issue and suggesting the way to detect asan.
The basic idea is that by looking through operand instructions which don't change the equality result that we can push the existing known bits comparison down past instructions which would obscure them.
We have analogous handling in InstSimplify for most - though weirdly not all - of these cases starting from an icmp root. It's a bit unfortunate to duplicate logic, but since my actual goal is to extend BasicAA, the icmp logic doesn't help. (And just makes it hard to test here.) The BasicAA change will be posted separately for review.
Differential Revision: https://reviews.llvm.org/D92698
This does not deserve special handling. The code should be added to Clang
instead if deemed useful. With this simplification, we can additionally delete
the PIC extern_weak special case.
With my previous commit, X86Subtarget::classifyGlobalReference has learned to
use MO_NO_FLAG for 32-bit ELF -fno-pic code, the x86-32 special case in
TargetMachine::shouldAssumeDSOLocal can be removed. Since we no longer imply
dso_local for function declarations, we can drop the ppc64 special case as well.
This is NFC in terms of Clang emitted assembly.
clang/lib/CodeGen/CodeGenModule sets dso_local on applicable function declarations,
we don't need to duplicate the work in TargetMachine:shouldAssumeDSOLocal.
(Actually the long-term goal (started by r324535) is to drop TargetMachine::shouldAssumeDSOLocal.)
By not implying dso_local, we will respect dso_local/dso_preemptable specifiers
set by the frontend. This allows the proposed -fno-direct-access-external-data
option to work with -fno-pic and prevent a canonical PLT entry (SHN_UNDEF with non-zero st_value)
when taking the address of a function symbol.
This patch should be NFC in terms of the Clang emitted assembly because the case
we don't set dso_local is a case Clang sets dso_local. However, some tests don't
set dso_local on some function declarations and expose some differences. Most
tests have been fixed to be more robust in the previous commit.
They are currently implicit because TargetMachine::shouldAssumeDSOLocal implies
dso_local.
For such function declarations, clang -fno-pic emits the dso_local specifier.
Adding explicit dso_local makes these tests align with the clang behavior and
helps implementing an option to use GOT indirection when taking the address of a
function symbol in -fno-pic (to avoid a canonical PLT entry (SHN_UNDEF with
non-zero st_value)).
Due to the recursion through phis basicaa does, the code needs to be extremely careful not to reason about equality between values which might represent distinct iterations. I'm generally skeptical of the correctness of the whole scheme, but this particular patch fixes one particular instance which is demonstrateable incorrect.
Interestingly, this appears to be the second attempted fix for the same issue. The former fix is incomplete and doesn't address the actual issue.
Differential Revision: https://reviews.llvm.org/D92694
This essentially reverts the x86-64 side effect of r327198.
For x86-32, @PLT (R_386_PLT32) is not suitable in -fno-pic mode so the
code forces MO_NO_FLAG (like a forced dso_local) (https://bugs.llvm.org//show_bug.cgi?id=36674#c6).
For x86-64, both `call/jmp foo` and `call/jmp foo@PLT` emit R_X86_64_PLT32
(https://sourceware.org/bugzilla/show_bug.cgi?id=22791) so there is no
difference using @PLT. Using @PLT is actually favorable because this drops
a difference with -fpie/-fpic code and makes it possible to avoid a canonical
PLT entry when taking the address of an undefined function symbol.
LLVMBuild has been removed from the build system. However, three LLVMBuild.txt
files remain in the tree. This patch simply removes them.
llvm/lib/ExecutionEngine/Orc/TargetProcess/LLVMBuild.txt
llvm/tools/llvm-jitlink/llvm-jitlink-executor/LLVMBuild.txt
llvm/tools/llvm-profgen/LLVMBuild.txt
Differential Revision: https://reviews.llvm.org/D92693
The function accrues many `GV` nullness checks. Process `!GV`
(ExternalSymbolSDNode) early to simplify code.
Also improve a comment added in r327198 (intrinsics is a subset of
ExternalSymbolSDNode).
Intended to be NFC.
D91692 missed various locations in kmp_gsupport, where the scope for
OMPT_STORE_RETURN_ADDRESS is too narrow, i.e. the scope ends before the OMPT
callback is called in some nested function.
This patch fixes the scoping issue, so that all OMPT tests pass, when the
tests are built with gcc.
Differential Revision: https://reviews.llvm.org/D92121
This is the the minimal change introduced in [[ https://reviews.llvm.org/D88599 | D88599 ]] to unblock the controversial change and discussion of proper separation between thread from thread id which will continue in D88599.
This patch will address the differences of definition of pthread_t on z/OS vs. Linux and other OS. Main trick to make the code work on z/OS relies on redefining libcpp_thread_id type and _LIBCPP_NULL_THREAD macro. This is necessary to separate initialization of libcxx_thread_id from the one of __libcxx_thread_t;
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D91875
I use several of the clang-format clean directories as a test suite, this one had got slightly out of wack in a prior commit
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D92666
PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead.
Actually Clang ppc32 does produce a pair of absolute relocations which match GCC.
This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).
Avoid calling getFlags on a non-existent symbol.
The way this is triggered is by calling strip -N on a binary, which sets
the MH_NLIST_OUTOFSYNC_WITH_DYLDINFO header flag. Then, in the
LC_FUNCTION_STARTS command, nm is trying to print the stripped symbols
and needs the proper checks.
IsSigned and its accessor, isSigned, were introduced on Oct 25, 2017
in commit 9ac7021a25. The last use was
removed on Nov 20, 2017 in commit
268467869b.
Trailing objects are really nice for storing additional data inline with the main class, and is something that we heavily take advantage of for Operation(and many other classes). To get the address of the inline data you need to compute the address by doing some pointer arithmetic taking into account any objects stored before the object you want to access. Most classes keep the count of the number of objects, so this is relatively cheap to compute. This is not the case for results though, which have two different types(inline and trailing) that are not necessarily as cheap to compute as the count for other objects. This revision moves the storage for results to before the operation and stores them in reverse order. This allows for getting results to still be very fast given that they are never iterated directly in order, and also greatly improves the speed when accessing the other trailing objects of an operation(operands/regions/blocks/etc).
This reduced compile time when compiling a decently sized mlir module by about ~400ms, or 2.17s -> 1.76s.
Differential Revision: https://reviews.llvm.org/D92687
The check for formatting enum attributes was missing a call to get the base attribute, which is necessary to strip off the top-level OptionalAttr<> wrapper.
Differential Revision: https://reviews.llvm.org/D92713
This patch fixes builtins' CMakeLists.txt and their VFP tests to check
the standard macro defined in the ACLE for VFP support. It also enables
the tests to be built and run for single-precision-only targets while
builtins were built with double-precision support.
Differential revision: https://reviews.llvm.org/D92497
Use the newly added spawnattr API, posix_spawnattr_setarchpref_np, to
select a slice preferences per cpu and subcpu types, instead of just cpu
with posix_spawnattr_setarchpref_np.
rdar://16094957
Differential revision: https://reviews.llvm.org/D92712