Summary:
28c91219c7 introduced an interceptor for `sigaltstack`. It turns out this
broke `setjmp` on i386 macOS. This is because the implementation of `setjmp` on
i386 macOS is written in assembly and makes the assumption that the call to
`sigaltstack` does not clobber any registers. Presumably that assumption was
made because it's a system call. In particular `setjmp` assumes that before
and after the call that `%ecx` will contain a pointer the `jmp_buf`. The
current interceptor breaks this assumption because it's written in C++ and
`%ecx` is not a callee-saved register. This could be fixed by writing a
trampoline interceptor to the existing interceptor in assembly that
ensures all the registers are preserved. However, this is a lot of work
for very little gain. Instead this patch just disables the interceptor
on i386 macOS.
For other Darwin architectures it currently appears to be safe to intercept
`sigaltstack` using the current implementation because:
* `setjmp` for x86_64 saves the pointer `jmp_buf` to the stack before calling `sigaltstack`.
* `setjmp` for armv7/arm64/arm64_32/arm64e appears to not call `sigaltstack` at all.
This patch should unbreak (once they are re-enabled) the following
tests:
```
AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.LongJmpTest
AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer.SigLongJmpTest
AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.LongJmpTest
AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer.SigLongJmpTest
AddressSanitizer-i386-darwin :: TestCases/longjmp.cpp
```
This patch introduces a `SANITIZER_I386` macro for convenience.
rdar://problem/62141412
Reviewers: kubamracek, yln, eugenis
Subscribers: kristof.beyls, #sanitizers, llvm-commits
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D82691
Somehow UBSan would only report the unaligned load in TestLinuxCore.py
when running the tests with reproducers. This patch fixes the issue by
using a memcpy in the GetDouble and the GetFloat method.
Differential revision: https://reviews.llvm.org/D83256
In the test based on PR46586:
https://bugs.llvm.org/show_bug.cgi?id=46586
...we are inserting 16-bits into the high element of the vector, shuffling it
to element 0, and extracting 32-bits. But xmm1 was never initialized, so the
top 16-bits of the extract are undef without this patch.
(It seems like we could do better than this by recognizing that we only demand
a subsection of the build vector, but I want to make sure we fix the
miscompile 1st.)
This path is only used for pre-SSE4.1, and simpler patterns get squashed
somewhere along the way, so the test still includes a 'urem' as it did in the
original test from the bug report.
Differential Revision: https://reviews.llvm.org/D83319
In GNU ld, --no-relax can disable x86-64 GOTPCRELX relaxation.
It is not useful, so we don't implement it.
For RISC-V, --no-relax disables linker relaxations which have larger
impact.
Linux kernel specifies --no-relax when CONFIG_DYNAMIC_FTRACE is specified
(since http://git.kernel.org/linus/a1d2a6b4cee858a2f27eebce731fbf1dfd72cb4e ).
LLD has not implemented the relaxations, so this option is a no-op.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D81359
The Ubuntu system ld does not recognize the amdgcn-amd-amdhsa target.
Instead the host object with embedded device fat binary should not be
assembled by that triple. It should use default triple, so that the
object is compatible with system ld.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D83145
Summary:
This patch aims to combine similar arm64 register set definitions defined in NativeRegisterContextLinux_arm64 and RegisterContextPOSIX_arm64.
I have implemented a register set interface out of RegisterInfoInterface class and moved arm64 register sets into RegisterInfosPOSIX_arm64 which is similar to Utility/RegisterContextLinux_* implemented by various other targets. This will help in managing register sets of new ARM64 architecture features in one place.
Built and tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabihf targets.
Reviewers: labath
Reviewed By: labath
Subscribers: mhorne, emaste, kristof.beyls, atanasyan, danielkiss, lldb-commits
Differential Revision: https://reviews.llvm.org/D80105
Adds a matcher called `hasDirectBase` for matching the `CXXBaseSpecifier` of a class that directly derives from another class.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D81552
These functions were doing a bitcast on the float value, which is not
consistent with the other getters, which were doing a numeric conversion
(47.0 -> 47). Change these to do numeric conversions too.
Summary: This patch makes code motion checks optional which are dependent on
specific analysis example, dominator tree, post dominator tree and dependence
info. The aim is to make the adoption of CodeMoverUtils easier for clients that
don't use analysis which were strictly required by CodeMoverUtils. This will
also help in diversifying code motion checks using other analysis example MSSA.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D82566
Summary:
Import declarations in correct order if a class contains
multiple redundant friend (type or decl) declarations.
If the order is incorrect this could cause false structural
equivalences and wrong declaration chains after import.
Reviewers: a.sidorin, shafik, a_sidorin
Reviewed By: shafik
Subscribers: dkrupp, Szelethus, gamesh411, teemperor, martong, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75740
Summary:
Retaining per device primary context is preferred to creating a context owned by the plugin.
From CUDA documentation
1. Note that the use of multiple CUcontext s per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended that the implicit one-to-one device-to-context mapping for the process provided by the CUDA Runtime API be used." from https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DRIVER.html
2. Right under cuCtxCreate. In most cases it is recommended to use cuDevicePrimaryCtxRetain. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__CTX.html#group__CUDA__CTX_1g65dc0012348bc84810e2103a40d8e2cf
3. The primary context is unique per device and shared with the CUDA runtime API. These functions allow integration with other libraries using CUDA. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PRIMARY__CTX.html#group__CUDA__PRIMARY__CTX
Two issues are addressed by this patch:
1. Not using the primary context caused interoperability issue with libraries like cublas, cusolver. CUBLAS_STATUS_EXECUTION_FAILED and cudaErrorInvalidResourceHandle
2. On OLCF summit, "Error returned from cuCtxCreate" and "CUDA error is: invalid device ordinal"
Regarding the flags of the primary context. If it is inactive, we set CU_CTX_SCHED_BLOCKING_SYNC. If it is already active, we respect the current flags.
Reviewers: grokos, ABataev, jdoerfert, protze.joachim, AndreyChurbanov, Hahnfeld
Reviewed By: jdoerfert
Subscribers: openmp-commits, yaxunl, guansong, sstefan1, tianshilei1992
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D82718
Even non-remote targets may need to set the launch environment
((DY)LD_LIBRARY_PATH, specifically) to run successfully.
Also, add an assertion to better detect the case when launching a target
fails and the breakpoint is never hit.
The (previously-crashing) test-case would cause us to seemingly-harmlessly
replace some use with something else, but we can't replace it with itself,
so we would crash.
When an argument has 'byval' attribute and should be
passed on the stack according calling convention,
a stack copy would be emitted twice. This will cause
the real value will be put into stack where the pointer
should be passed.
Differential Revision: https://reviews.llvm.org/D83175
`MipsGOTParser` is a helper class that is used to dump MIPS GOT and PLT.
There is a problem with it: it might call report_fatal_error() on invalid input.
When this happens, the tool reports a crash:
```
# command stderr:
LLVM ERROR: Cannot find PLTGOT dynamic table tag.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backt
race.
Stack dump:
...
```
Such error were not tested. In this patch I've refactored `MipsGOTParser`:
I've splitted handling of GOT and PLT to separate methods. This allows to propagate
any possible errors to caller and should allow to dump the PLT when something is wrong
with the GOT and vise versa in the future.
I've added tests for each `report_fatal_error()`
and now calling the `reportError` instead. In the future we might want to switch to
reporting warnings, but it requres the additional testing and should
be performed independently.
I've kept `unwrapOrError` calls untouched for now as I'd like to focus on eliminating
`report_fatal_error` calls in this patch only.
Differential revision: https://reviews.llvm.org/D83225
Currently, llvm-readobj calls `report_fatal_error` when an object has
both REL and RELA dynamic relocations.
llvm-readelf is able to handle this case properly. This patch adds such a test case
and adjusts the llvm-readobj code to follow (and be consistent with its own RELR and PLTREL cases).
Differential revision: https://reviews.llvm.org/D83232
Summary:
.clangd/index was well-intentioned in 2754942cba, but `.clangd` is the best
filename for the clangd config file (matching .clang-format and .clang-tidy).
And of course we can't have both .clangd/index and .clangd...
There are a few overlapping goals to satisfy:
- it should be clear from the directory name that this is transient
data that is safe to delete at the cost of recomputation, i.e. a cache
- it should be easy and self-documenting to blacklist these files in .gitignore
- we should have some consistency between filenames in-tree and
corresponding files in user storage (e.g. under XDG's ~/.cache/)
- we should be consistent across platforms (including windows, which
doesn't have distinct cache vs config directories)
So the plan is:
$PROJECT/.clangd (project config)
$PROJECT/.cache/clangd/index/ (project index)
$PROJECT/.cache/clangd/modules/ (maybe in future)
$XDG_CONFIG_HOME/clangd/config.yaml (user config)
$XDG_CACHE_HOME/clangd/index/ (index of non-project files)
$XDG_CACHE_HOME/clangd/modules/ (maybe in future)
This is sensible if XDG_{CONFIG,CACHE}_HOME coincide, and has a simple
.gitignore rule going forward: `.cache/`.
The monorepo gitignore is updated to reflect the backwards-compatible practice:
ignore .clangd/ (with trailing slash) matching index files from clangd 9/10
ignore .cache matching index from clangd 11+, and potentially other tools.
The entries from llvm-project/llvm gitignore are removed (obsolete).
Reviewers: kadircet, hokein
Subscribers: ilya-biryukov, MaskRay, jkorous, omtcyfz, arphaman, usaxena95, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D83099
Similar to OwningModuleRef, OwningSPIRVModuleRef signals ownership
transfer clearly. This is useful for APIs like spirv::deserialize,
where a spirv::ModuleOp is returned by deserializing SPIR-V binary
module.
This addresses the ASAN error as reported in
https://bugs.llvm.org/show_bug.cgi?id=46272
Differential Revision: https://reviews.llvm.org/D81652
If a loop is in a function marked OptSize, Loop Access Analysis should refrain
from generating runtime checks for unit strides that will version the loop.
If a loop is in a function marked OptSize and its vectorization is enabled, it
should be vectorized w/o any versioning.
Fixes PR46228.
Differential Revision: https://reviews.llvm.org/D81345
io.BytesIO seems to produce a stream in Python 2 which isn't recognized
as a file object in the SWIG API, so this test fails for Python 2 (and I assume
also an old SWIG version needs to be involved).
Instead just open an empty input file which is a file object in all Python
versions to make this test pass everywhere.
It is possible to:
1) Avoid using the `unwrapOrError` calls and hence allow to continue dumping even when
something is not OK with one of SHT_LLVM_LINKER_OPTIONS sections.
2) replace `reportWarning` with `reportUniqueWarning` calls. In this method it is no-op,
because it is not possible to have a duplicated warnings anyways, but since we probably
want to switch to `reportUniqueWarning` globally, this is a good thing to do.
This patch addresses both these points.
Differential revision: https://reviews.llvm.org/D83131
This introduces `printHashTableSymbols` and
`printGNUHashTableSymbols` to split the `printHashSymbols`.
It makes the code more readable and consistent.
Differential revision: https://reviews.llvm.org/D83040
Summary:
When splitting a store of a scalable type, the new address is
calculated in SplitVecOp_STORE using a vscale and an add instruction.
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83041
This is a followup for D83129.
It is possible to make `getStaticSymbolName` report warnings inside
and return the "<?>" on a error. This allows to encapsulate errors handling
and slightly simplifies the logic in callers code.
Differential revision: https://reviews.llvm.org/D83208
The code we have currently reports an error if something is not right with the
profile section. Instead we can report a warning and continue dumping when it is possible.
This patch does it.
Differential revision: https://reviews.llvm.org/D83129
Summary:
When splitting a load of a scalable type, the new address is
calculated in SplitVecRes_LOAD using a vscale and an add instruction.
This patch also adds a DAG combiner fold to visitADD for vscale:
- Fold (add (vscale(C0)), (vscale(C1))) to (add (vscale(C0 + C1)))
Reviewers: sdesmalen, efriedma, david-arm
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82792
Summary:
Unify the code for requiring a complete type and move it into a single
place. The only functional change is that the "cannot start a definition
of an incomplete type" is upgrated from a runtime error/warning to an
lldbassert. An plain assert might also be fine, since (AFAICT) this can
only happen in case of a programmer error.
Reviewers: teemperor, aprantl, shafik
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D83199
We might crash when the dynamic symbols table is empty (or not found)
and --hash-symbols is requested. Both .hash and .gnu.hash logic is affected.
The patch fixes this issue.
Differential revision: https://reviews.llvm.org/D83037
Summary:
This patch enhances parser support for flush construct to OpenMP 5.0 by including memory-order-clause.
2.18.8 flush Construct
!$omp flush [memory-order-clause] [(list)]
where memory-order-clause is
acq_rel
release
acquire
The patch includes code changes and testcase modifications.
Reviewed By: klausler, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D82177
There are now more SVE tests in LLVM and Clang that do not
emit warnings related to invalid use of EVT::getVectorNumElements()
and VectorType::getNumElements(). For these tests I have added
additional checks that there are no warnings in order to prevent
any future regressions.
Differential Revision: https://reviews.llvm.org/D82943
In an earlier commit 584d0d5c17 I
added functionality to allow AArch64 CodeGen support for falling
back to DAG ISel when Global ISel encounters scalable vector
types. However, it seems that we were not falling back early
enough as llvm::getLLTForType was still being invoked for scalable
vector types.
I've added a new fallback function to the call lowering class in
order to catch this problem early enough, rather than wait for
lowerFormalArguments to reject scalable vector types.
Differential Revision: https://reviews.llvm.org/D82524
This patch fixes all remaining warnings in:
llvm/test/CodeGen/AArch64/sve-trunc.ll
llvm/test/CodeGen/AArch64/sve-vector-splat.ll
I hit some warnings related to getCopyPartsToVector. I fixed two
issues:
1. In widenVectorToPartType() we assumed that we'd always be
using BUILD_VECTOR nodes to expand from one vector type to another,
which is incorrect for scalable vector types. I've fixed this for now
by simply bailing out immediately for scalable vectors.
2. In getCopyToPartsVector() I've changed the code to compare
the element counts of different types.
Differential Revision: https://reviews.llvm.org/D83028
'64bit' shows up from -march=native on 64-bit capable CPUs.
'retpoline-eternal-thunk' isn't a real feature but shows up
when -mretpoline-external-thunk is passed to clang.