The LBR entry target can be the first address of the text section, which causes an out-of-range crash. Add a boundary check to handle this case.
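A minimal sketch of the kind of boundary check described, not the actual patch. The helper name `getPrevInstAddr` and the sorted address vector are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: look up the instruction address that precedes Target
// in a sorted list of instruction addresses. Returns false when there is no
// previous instruction, e.g. when Target is the first address of the section.
bool getPrevInstAddr(const std::vector<uint64_t> &SortedAddrs, uint64_t Target,
                     uint64_t &Prev) {
  auto It = std::lower_bound(SortedAddrs.begin(), SortedAddrs.end(), Target);
  // Boundary check: stepping back from begin() would be out of range.
  if (It == SortedAddrs.begin())
    return false;
  Prev = *(It - 1);
  return true;
}
```

Callers are expected to test the return value before using `Prev`.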
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D110271
Instead of overloading `__to_address`, let's specialize `pointer_traits`.
Function overloads need to be in scope at the point where they're called,
whereas template specializations do not. (User code can provide pointer_traits
specializations to be used by already-included library code, so obviously
`__wrap_iter` can do the same.)
`pointer_traits<__wrap_iter<It>>` cannot provide `pointer_to`, because
you generally cannot create a `__wrap_iter` without also knowing the
identity of the container into which you're trying to create an iterator.
I believe this is OK; contiguous iterators are required to provide
`to_address` but *not* necessarily `pointer_to`.
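A minimal sketch of the approach, using a hypothetical `WrapIter` as a stand-in for `__wrap_iter` and assuming the wrapped iterator is a raw pointer. It is not the libc++ implementation.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for __wrap_iter: a thin wrapper around a raw pointer.
template <class It>
class WrapIter {
public:
  explicit WrapIter(It I) : I_(I) {}
  It base() const { return I_; }

private:
  It I_;
};

namespace std {
// A specialization is found at instantiation time, so it does not have to be
// in scope at the point where library code calls it.
template <class It>
struct pointer_traits<WrapIter<It>> {
  using pointer = WrapIter<It>;
  using element_type = typename pointer_traits<It>::element_type;
  using difference_type = typename pointer_traits<It>::difference_type;

  // Deliberately no pointer_to: an iterator cannot be created without
  // knowing which container it points into.
  static element_type *to_address(pointer P) noexcept { return P.base(); }
};
} // namespace std
```

In C++20, `std::to_address` uses `pointer_traits<Ptr>::to_address` when that member exists, so a specialization like this is picked up without any overload being visible at the call site.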
Differential Revision: https://reviews.llvm.org/D110198
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.
It interprets the same metadata as the LoopDistribute pass.
intercept-rethrow-exception.cc fails when running runtimes tests if linking in
a hermetic libc++abi. This is because if libc++abi is used, then asan expects
to intercept __cxa_rethrow_primary_exception on linux, which should unpoison the
stack. If we statically link in libc++abi though, it will contain a strong
definition for __cxa_rethrow_primary_exception which wins over the weakly
defined interceptor provided by asan, causing the test to fail by not unpoisoning
the stack on the exception being thrown.
It's likely no one has encountered this before and possible that upstream tests
opt for dynamically linking where the interceptor can work properly. An ideal
long term solution would be to update the interceptor and libc++[abi] APIs to
work for this case, but that will likely take a long time to work out. In the
meantime, since the test isn't necessarily broken, we can just add another
REQUIRES check to make sure that it's only run if we aren't statically linking
in libc++abi.
Differential Revision: https://reviews.llvm.org/D109938
Without the preinliner, we need to tune down the cold count cutoff to merge/trim more contexts and limit profile size for large components. However, it doesn't make sense for the cold threshold to be higher than the hot threshold, so we now use the hot threshold as the merging/trimming cutoff instead.
Differential Revision: https://reviews.llvm.org/D110212
This test makes sure kernels map to efficient sparse code, i.e. all
compressed for-loops, no co-iterating while loops. In addition, this
revision removes the special constant folding inside the sparse
compiler in favor of Mahesh's new generic linalg folding. Thanks!
NOTE: relies on Mahesh's fix, which needs to be rebased first
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110001
A pointer with subscripts, substring indices, or components cannot
be initialized by a DATA statement (although of course a whole pointer
can be so). Catch the missing cases.
Differential Revision: https://reviews.llvm.org/D109931
While at it, add the diagnostic message "left operand of comma operator has no effect" (used by GCC) for the comma operator.
This also makes Clang diagnose in the constant evaluation context, which aligns with GCC/MSVC behavior. (https://godbolt.org/z/7zxb8Tx96)
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D103938
By not using ADDIW we can cause both an ADDIW and ADDI to be emitted
when the add has multiple users.
These instructions needed to be added to the list of instructions that
only use the lower 32 bits of their input.
I've also added tests for the wu versions, but I'm having trouble
showing bad codegen from it.
When both a DefaultValuedAttr and a successor or variadic region were specified, this would generate an invalid C++ declaration: the parameter with a default value would be followed by the successors/regions, which have no default, and C++ does not allow that.
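A minimal sketch of the C++ rule involved, with hypothetical `Block`, `OpState` and `build` names. Dropping the default, as shown here, is only one way to keep the declaration valid and is an assumption, not necessarily what the patch generates.

```cpp
#include <cstddef>
#include <vector>

struct Block {}; // hypothetical stand-in for a successor block

struct OpState {
  int Attr;
  std::size_t NumSuccessors;
};

// Invalid C++: a defaulted parameter followed by one without a default.
//   OpState build(int Attr = 0, const std::vector<Block *> &Successors);

// Valid: no default on Attr because a non-defaulted parameter follows it.
OpState build(int Attr, const std::vector<Block *> &Successors) {
  OpState S = {Attr, Successors.size()};
  return S;
}
```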
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110205
This is a follow-up to D110029, which uses a bitset to indicate the execution mode. This patch makes the corresponding changes in the function call.
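A minimal sketch of encoding an execution mode as a bitset. The flag names and values below are hypothetical and chosen only for illustration; they are not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical flag names and values, for illustration only.
enum ExecModeFlags : std::uint8_t {
  ModeGeneric = 1u << 0,
  ModeSPMD = 1u << 1,
  ModeGenericSPMD = ModeGeneric | ModeSPMD
};

// Test whether a given flag is set in the mode bitset.
inline bool hasMode(std::uint8_t Mode, std::uint8_t Flag) {
  return (Mode & Flag) != 0;
}
```

A function that previously took a boolean mode parameter can take the bitset and query the bit it needs with `hasMode`.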
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110279
Currently, linux kernel has a __user attribute ([1]) defined as
__attribute__((noderef, address_space(__user)))
which is used by sparse tool ([2]) to do some
type checking of pointers to user space memory.
During normal compilation, __user will be defined
to nothing so it won't have an impact on compilation.
The btf_tag attribute, which is motivated by
carrying linux kernel annotations into dwarf/BTF,
is introduced in [3]. We intended to define __user as
__attribute__((btf_tag("user")))
so such information will be encoded in dwarf/BTF
and can be used later by bpf verification or other
tracing tools.
But linux kernel __user attribute is also used during
type conversion which btf_tag doesn't support ([4]) since
such type conversion is only used for compiler analysis
and not encoded in dwarf/btf. Theoretically, it is
possible for clang to understand these tags and
do sparse-like type checking. But I would like
to leave that to future work and, for now, suggest simply
ignoring these btf_tag attributes if they are used
as type attributes.
[1] https://github.com/torvalds/linux/blob/master/include/linux/compiler_types.h#L10
[2] https://sparse.docs.kernel.org/en/latest/
[3] https://reviews.llvm.org/D106614
[4] https://github.com/torvalds/linux/blob/master/fs/binfmt_flat.c#L135
Differential Revision: https://reviews.llvm.org/D110116
The current folder of constant -> generic op only handles splat
constants. The same logic holds for scalar constants. Teach the
pattern to handle such cases.
Differential Revision: https://reviews.llvm.org/D109982
This fixes a bug where we discover new information about the arguments of an
already executable edge, but don't visit the arguments. We only visit the arguments, and not the block itself, so this commit shouldn't really affect performance at all.
Fixes PR#51871
Differential Revision: https://reviews.llvm.org/D110197
The operation should be reset to its original state when the updates are canceled.
Reviewed By: rriddle, ftynse
Differential Revision: https://reviews.llvm.org/D110176
All supported compilers provide support for inline variables in C++17 now.
Also, as a fly-by fix, replace some uses of _LIBCPP_CONSTEXPR by just
constexpr.
The only exception in this patch is `std::ignore`, which is provided
prior to C++17. Since it is defined in an anonymous namespace, it always
has internal linkage anyway, so using an inline variable there doesn't
provide any benefit. Instead, `inline` was removed entirely on `std::ignore`.
Differential Revision: https://reviews.llvm.org/D110243
Now not just SUM, but also PRODUCT, AND, OR, XOR. The reductions
MIN and MAX are still to be done (also depends on recognizing
these operations in cmp-select constructs).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D110203
(As I mentioned in https://reviews.llvm.org/D62609#1534158 ,
the condition for using bti c for executable can be loosened.)
In two cases the address of a PLT may escape:
* canonical PLT entry for a STT_FUNC
* non-preemptible STT_GNU_IFUNC which is converted to STT_FUNC
The first case can be detected with `needsPltAddr`.
The second case is not straightforward to detect because for the Relocations.cpp
created `directSym`, it's difficult to know whether the associated `sym` has
exercised the `!needsPlt(expr)` code path. Just use the conservative `isInIplt`
condition. A non-preemptible ifunc not referenced by non-GOT-generating
non-PLT-generating relocations will have an unneeded `bti c`, but the cost is acceptable.
The second case fixes a bug as well: a -shared link may have non-preemptible ifunc.
Before the patch we did not emit `bti c` and could be wrong if the PLT address escaped.
GNU ld doesn't handle the case: `relocation R_AARCH64_ADR_PREL_PG_HI21 against STT_GNU_IFUNC symbol 'ifunc2' isn't handled by elf64_aarch64_final_link_relocate` (https://sourceware.org/bugzilla/show_bug.cgi?id=28370)
For -shared, if BTI is enabled but PAC is disabled, the PLT entry size increases
from 16 to 24 because we have to select the PLT scheme early, but the cost is
acceptable.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D110217
This is to build the foundation of a new debug info feature to use only
the base name of template as its debug info name (eg: "t1" instead of
the full "t1<int>"). The intent being that a consumer can still retrieve
all that information from the DW_TAG_template_*_parameters.
So gno-simple-template-names is business as usual/previously ("t1<int>")
=simple is the simplified name ("t1")
=mangled is a special mode to communicate the full information, but
also indicate that the name should be able to be simplified. The data
is encoded as "_STNt1|<int>" which will be matched with an
llvm-dwarfdump --verify feature to deconstruct this name, rebuild the
original name, and then try to rebuild the simple name via the DWARF
tags - then compare the latter and the former to ensure that all the
data necessary to fully rebuild the name is present.
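A minimal sketch of how the "_STNt1|<int>" encoding could be taken apart, assuming the layout is the "_STN" prefix, the simple name, a '|' separator, then the template argument list. The `decodeSTN` helper is hypothetical and is not the llvm-dwarfdump implementation.

```cpp
#include <cstddef>
#include <string>

struct DecodedName {
  bool Valid;
  std::string Simple; // e.g. "t1"
  std::string Full;   // e.g. "t1<int>"
};

// Hypothetical decoder for names of the form "_STN<simple>|<template-args>".
DecodedName decodeSTN(const std::string &Name) {
  const std::string Prefix = "_STN";
  DecodedName R = {false, "", ""};
  if (Name.compare(0, Prefix.size(), Prefix) != 0)
    return R; // not a mangled simple-template name
  std::size_t Bar = Name.find('|', Prefix.size());
  if (Bar == std::string::npos)
    return R; // separator missing
  R.Valid = true;
  R.Simple = Name.substr(Prefix.size(), Bar - Prefix.size());
  R.Full = R.Simple + Name.substr(Bar + 1);
  return R;
}
```

A verifier would then rebuild the full name from the DWARF template parameter tags and compare it against `Full`.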
The work that IRExecutionUnit::CollectFallbackNames does is basically the
work that `CPlusPlusLanguage::GetDemangledFunctionNameWithoutArguments`
does already. It's also (at the time of writing) specific to C++, so it can
be folded into `IRExecutionUnit::CollectCandidateCPlusPlusNames`.
Differential Revision: https://reviews.llvm.org/D109928
Neither of these passes modifies the CFG, allowing us to preserve DomTree
and LoopInfo across them by using setPreservesCFG.
Differential Revision: https://reviews.llvm.org/D110161
Allow multiversioning declarations to match when the actual formal
linkage matches, not just when the storage class is identical.
Additionally, change the ambiguous 'linkage' mismatch to be more
specific and say 'language linkage'.
The return type of strlen is size_t, not just any integer.
This is a partial fix for an example based on:
https://llvm.org/PR50836
There's another bug here because we can still crash
processing a real strlen or something that looks like it.
This applies to -Wunused-but-set-variable and
-Wunused-but-set-parameter.
This addresses bug 51865.
Differential Revision: https://reviews.llvm.org/D109862
If no interchange vector is given, initialize it with the identity permutation from 0 to the number of loops.
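A minimal sketch of that default, using a hypothetical `getInterchangeVector` helper rather than the code from the patch.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: return Given unchanged if it is non-empty, otherwise
// the identity permutation 0, 1, ..., NumLoops - 1.
std::vector<unsigned> getInterchangeVector(std::vector<unsigned> Given,
                                           unsigned NumLoops) {
  if (Given.empty()) {
    Given.resize(NumLoops);
    std::iota(Given.begin(), Given.end(), 0u);
  }
  return Given;
}
```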
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D110249
For large apps, dumping the disassembly of the whole program can be slow and result in giant output. Add a switch to dump specific symbols only.
Reviewed By: wlei
Differential Revision: https://reviews.llvm.org/D110079
The pass uses different cost kinds to estimate "old" and "interleaved" costs:
the default cost kind for all targets that override `getInterleavedMemoryOpCost()` is
`TCK_SizeAndLatency`. Although at the moment the estimated `TCK_Latency` costs are
equal to `TCK_SizeAndLatency` (so the change is NFC), this may change in the future.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D110100
This doesn't add all of the document numbers, but it adds a bunch of
them. Not all of the documents are available on the committee page
(they're old enough that they come from a time when the mailing was
comprised of physical pieces of paper), so some of the documents listed
are assumed to be correct based on my reading of editor's reports.
When determining whether to fold branches to a common destination by
merging two blocks, SimplifyCFG will count the number of instructions to
be moved into the first basic block. However, there's no reason to count
free instructions like bitcasts and other similar instructions.
This resolves missed branch foldings with -fstrict-vtable-pointers in
llvm-test-suite's lambda benchmark.
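A simplified model of the counting change. The `Kind` enum, the `isFree` rule (only bitcasts are free) and the threshold check are assumptions for illustration; SimplifyCFG itself works on real IR instructions.

```cpp
#include <vector>

// Hypothetical, simplified instruction kinds; real LLVM IR has many more.
enum class Kind { Add, Load, Store, Bitcast };

// Assumption for this sketch: only bitcasts are treated as free.
bool isFree(Kind K) { return K == Kind::Bitcast; }

// Count only the instructions that would actually cost something to move.
unsigned countNonFree(const std::vector<Kind> &Insts) {
  unsigned N = 0;
  for (Kind K : Insts)
    if (!isFree(K))
      ++N;
  return N;
}

// Hypothetical folding decision: fold when the non-free count fits the budget.
bool shouldFold(const std::vector<Kind> &Insts, unsigned Threshold) {
  return countNonFree(Insts) <= Threshold;
}
```

With a budget of one instruction, a block containing two bitcasts and one add is still foldable under this model, whereas counting every instruction would reject it.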
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D108837
Specifying .global and .weak causes a compiler warning:
warning: __sigsetjmp changed binding to STB_WEAK
Specifying only .weak should have the same effect without causing a
warning.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D110178
Initially, lli only supported lazy mode for ORC. Greedy mode was added with e1579894d2 and it's the default setting now. DebugObjectManagerPlugin tests don't rely on laziness, so we can switch them to greedy in order to avoid some unnecessary complexity.
This patch adds support for an RAII struct that will print function
traces when placed inside of a function declaration. Each successive
call will increase the indentation to make it easier to visually
inspect.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110202
This revision removes the ad-hoc MemRefs that were needed using the old
ABI (when we still passed by value) and replaces them with the shared
StridedMemRef definitions of CRunnerUtils (possible now that we pass by
pointer). This avoids code duplication and makes sure we have a consistent
view of strided memory references in all our support libraries.
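A minimal sketch of a strided memory reference descriptor and how an element is addressed through it. `StridedView` and `elementAt` are hypothetical names that only mirror the general shape of such a descriptor (base pointer, aligned data pointer, offset, sizes, strides); they are not the CRunnerUtils definitions.

```cpp
#include <cstdint>

// Hypothetical rank-N strided descriptor, for illustration only.
template <typename T, int N>
struct StridedView {
  T *basePtr;         // allocated pointer
  T *data;            // aligned pointer used for element access
  int64_t offset;     // offset of the first element, in elements
  int64_t sizes[N];   // extent of each dimension
  int64_t strides[N]; // step between consecutive elements, per dimension
};

// Address an element: data[offset + sum(Idx[i] * strides[i])].
template <typename T, int N>
T &elementAt(const StridedView<T, N> &V, const int64_t (&Idx)[N]) {
  int64_t Linear = V.offset;
  for (int I = 0; I < N; ++I)
    Linear += Idx[I] * V.strides[I];
  return V.data[Linear];
}
```

Passing such a descriptor by pointer lets every support library share one definition instead of each declaring its own copy.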
Reviewed By: jsetoain
Differential Revision: https://reviews.llvm.org/D110221
We can use riscv_vse intrinsic instead of riscv_vse_mask. The code here
is based on similar code for handling masked.scatter and vp.scatter.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D110206