Summary: Update Test (EXPECT_EQ and friends) to accept __uint128_t and floating point types (float, double, long double).
Reviewers: sivachandra
Subscribers: mgorny, libc-commits
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D83931
The AMDGPU handling of f16 vectors is terrible still since it gets
scalarized even when the vector operation is legal.
The code is is essentially duplicated between the non-strict and
strict case. Apparently no other expansions are currently trying to do
this. This is mostly because I found the behavior of
getStrictFPOperationAction to be confusing. In the ARM case, it would
expand strict_fsub even though it shouldn't due to the later check. At
that point, the logic required to check for legality was more complex
than just duplicating the 2 instruction expansion.
Add support for creating static libraries when the input includes only
Mach-O binaries (and not libraries/archives themselves).
Reviewed by alexshap, Ktwu, smeenai, jhenderson, MaskRay, mtrent
Differential Revision: https://reviews.llvm.org/D83002
SUMMARY:
when we call memset, memcopy,memmove etc(this are llvm intrinsic function) in the c source code. the llvm will generate IR
like call call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
for c source code
bash> cat test_memset.call
struct S{
int a;
int b;
};
extern struct S s;
void bar() {
memset(&s, s.b, s.b);
}
like
%struct.S = type { i32, i32 }
@s = external global %struct.S, align 4
; Function Attrs: noinline nounwind optnone
define void @bar() #0 {
entry:
%0 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
%1 = trunc i32 %0 to i8
%2 = load i32, i32* getelementptr inbounds (%struct.S, %struct.S* @s, i32 0, i32 1), align 4
call void @llvm.memset.p0i8.i32(i8* align 4 bitcast (%struct.S* @s to i8*), i8 %1, i32 %2, i1 false)
ret void
}
declare void @llvm.memset.p0i8.i32(i8* nocapture writeonly, i8, i32, i1 immarg) #1
If we want to let the aix as assembly compile pass without -u
it need to has following assembly code.
.extern .memset
(we do not output extern linkage for llvm instrinsic function.
even if we output the extern linkage for llvm intrinsic function, we should not out .extern llvm.memset.p0i8.i32,
instead of we should emit .extern memset)
for other llvm buildin function floatdidf . even if we do not call these function floatdidf in the c source code(the generated IR also do not the call __floatdidf . the function call
was generated in the LLVM optimized.
the function is not in the functions list of Module, but we still need to emit extern .__floatdidf
The solution for it as :
We record all the lllvm intrinsic extern symbol when transformCallee(), and emit all these symbol in the AsmPrinter::doFinalization(Module &M)
Reviewers: jasonliu, Sean Fertile, hubert.reinterpretcast,
Differential Revision: https://reviews.llvm.org/D78929
Use 'o' for the mangling specification instead of 'e'. This fixes an
error in the backend caused by a mismatch between the data layouts
generated by the backend and the frontend.
rdar://problem/64168540
Summary:
It turns out the `CHECK(addr >= reinterpret_cast<upt>(info.dli_saddr)`
can fail because on armv7s on iOS 9.3 `dladdr()` returns
`info.dli_saddr` with an address larger than the address we provided.
We should avoid crashing here because crashing in the middle of reporting
an issue is very unhelpful. Instead we now try to compute a function offset
if the value we get back from `dladdr()` looks sane, otherwise we don't
set the function offset.
A test case is included. It's basically a slightly modified version of
the existing `test/sanitizer_common/TestCases/Darwin/symbolizer-function-offset-dladdr.cpp`
test case that doesn't run on iOS devices right now.
More details:
In the concrete scenario on armv7s `addr` is `0x2195c870` and the returned
`info.dli_saddr` is `0x2195c871`.
This what LLDB says when disassembling the code.
```
(lldb) dis -a 0x2195c870
libdyld.dylib`<redacted>:
0x2195c870 <+0>: nop
0x2195c872 <+2>: blx 0x2195c91c ; symbol stub for: exit
0x2195c876 <+6>: trap
```
The value returned by `dladdr()` doesn't make sense because it points
into the middle of a instruction.
There might also be other bugs lurking here because I noticed that the PCs we
gather during stackunwinding (before changing them with
`StackTrace::GetPreviousInstructionPc()`) look a little suspicious (e.g. the
PC stored for the frame with fail to symbolicate is 0x2195c873) as they don't
look properly aligned. This probably warrants further investigation in the future.
rdar://problem/65621511
Reviewers: kubamracek, yln
Subscribers: kristof.beyls, llvm-commits, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D84262
Summary:
Need to avoid an optimization for base pointer mapping for target data
directives.
Reviewers: jdoerfert, ye-luo
Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D84182
This reverts commit bb8850d34d.
It broke 3 check-llvm-transforms-loopfusion tests in an ASAN build.
LoopFuse.cpp `for (BasicBlock *Pred : predecessors(BB)) {` may operate on a deleted BB.
This commons out a chunk of the different two operand MVE patterns into
a single helper multidef. Or technically two multidef patterns so that
the Dup qr patterns can also get the same treatment. This is most of the
two address instructions that we have some codegen pattern for (not ones
that we select purely from intrinsics). It does not include shifts,
which are more spread out and will need some extra work to be given the
same treatment.
Differential Revision: https://reviews.llvm.org/D83219
Every override returns true and its return value is never checked. I can't
see how clearing an OptionValue could fail, or what you would
do if it did. The return serves no purpose.
Differential Revision: https://reviews.llvm.org/D84253
The test doesn't really demonstrate the use of the debug_loc.dwo section
distinct from the debug_loc section for strings in debug_macro.dwo -
because there are no strings that appear uin debug_loc.dwo that weren't
already in debug_loc, so the indexes would remain the same even if the
section that was used was fixed (to use debug_loc.dwo as per spec).
Current tail duplication in machine block placement pass uses block frequency
information in cost model. But frequency number has only relative meaning
compared to other basic blocks in the same function. A large frequency number
doesn't mean it is hot and a small frequency number doesn't mean it is cold.
To overcome this problem, this patch uses profile count in cost model if it's
available. So we can tail duplicate real hot basic blocks.
Differential Revision: https://reviews.llvm.org/D83265
This CL allows asan allocator in fuchsia to decommit shadow memory
for memory allocated using mmap.
Big allocations in asan end up being allocated via `mmap` and freed with
`munmap`. However, when that memory is freed, asan returns the
corresponding shadow memory back to the OS via a call to
`ReleaseMemoryPagesToOs`.
In fuchsia, `ReleaseMemoryPagesToOs` is a no-op: to be able to free
memory back to the OS, you have to hold a handle to the vmo you want to
modify, which is tricky at the ReleaseMemoryPagesToOs level as that
function is not exclusively used for shadow memory.
The function `__sanitizer_fill_shadow` fills a given shadow memory range
with a specific value, and if that value is 0 (unpoison) and the memory
range is bigger than a threshold parameter, it will decommit that memory
if it is all zeroes.
This CL modifies the `FlushUnneededASanShadowMemory` function in
`asan_poisoning.cpp` to add a call to `__sanitizer_fill_shadow` with
value and threshold = 0. This way, all the unneeded shadow memory gets
returned back to the OS.
A test for this behavior can be found in fxrev.dev/391974
Differential Revision: https://reviews.llvm.org/D80355
Change-Id: Id6dd85693e78a222f0329d5b2201e0da753e01c0
`Metadata` is being changed from an `llvm::Any` to a `MatchConsumer<llvm::Any>`
so that it's evaluation can be be dependent on on `MatchResult`s passed in.
Reviewed By: ymandel, gribozavr2
Differential Revision: https://reviews.llvm.org/D83820
... on systems where wait() isn't one of the declarations transitively included
via unistd.h (i.e. Darwin).
Differential Revision: https://reviews.llvm.org/D84207
Introduces the scatter/gather operations to the Vector dialect
(important memory operations for sparse computations), together
with a first reference implementation that lowers to the LLVM IR
dialect to enable running on CPU (and other targets that support
the corresponding LLVM IR intrinsics).
The operations can be used directly where applicable, or can be used
during progressively lowering to bring other memory operations closer to
hardware ISA support for a gather/scatter. The semantics of the operation
closely correspond to those of the corresponding llvm intrinsics.
Note that the operation allows for a dynamic index vector (which is
important for sparse computations). However, this first reference
lowering implementation "serializes" the address computation when
base + index_vector is converted to a vector of pointers. Exploring
how to use SIMD properly during these step is TBD. More general
memrefs and idiomatic versions of striding are also TBD.
Reviewed By: arpith-jacob
Differential Revision: https://reviews.llvm.org/D84039
Outside of compiler-rt (where it's arguably an anti-pattern too),
LLVM tries to keep its build files as simple as possible. See e.g.
llvm/docs/SupportLibrary.rst, "Code Organization".
Differential Revision: https://reviews.llvm.org/D84243
OptNoneInstrumentation is part of StandardInstrumentations. It skips
functions (or loops) that are marked optnone.
The feature of skipping optional passes for optnone functions under NPM
is gated on a -enable-npm-optnone flag. Currently it is by default
false. That is because we still need to mark all required passes to be
required. Otherwise optnone functions will start having incorrect
semantics. After that is done in following changes, we can remove the
flag and always enable this.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D83519
By using default arguments the caller can specify a subset without the
need for overloads. This is particularly useful in combination with
emplace_back as these objects are generally stored in a vector.
Summary:
FormattersContainer stores LLDB's formatters. It's implemented as a templated
map-like data structures that supports any kind of value type and only allows
ConstString and RegularExpression as the key types. The keys are used for
matching type names (e.g., the ConstString key `std::vector` matches the type
with the same name while RegularExpression keys match any type where the
RegularExpression instance matches).
The fact that a single FormattersContainer can only match either by string
comparison or regex matching (depending on the KeyType) causes us to always have
two FormatterContainer instances in all the formatting code. This also leads to
us having every type name matching logic in LLDB twice. For example,
TypeCategory has to implement every method twice (one string matching one, one
regex matching one).
This patch changes FormattersContainer to instead have a single `TypeMatcher`
key that wraps the logic for string-based and regex-based type matching and is
now the only possible KeyType for the FormattersContainer. This means that a
single FormattersContainer can now match types with both regex and string
comparison.
To summarize the changes in this patch:
* Remove all the `*_Impl` methods from `FormattersContainer`
* Instead call the FormatMap functions from `FormattersContainer` with a
`TypeMatcher` type that does the respective matching.
* Replace `ConstString` with `TypeMatcher` in the few places that directly
interact with `FormattersContainer`.
I'm working on some follow up patches that I split up because they deserve their
own review:
* Unify FormatMap and FormattersContainer (they are nearly identical now).
* Delete the duplicated half of all the type matching code that can now use one
interface.
* Propagate TypeMatcher through all the formatter code interfaces instead of
always offering two functions for everything.
There is one ugly design part that I couldn't get rid of yet and that is that we
have to support getting back the string used to construct a `TypeMatcher` later
on. The reason for this is that LLDB only supports referencing existing type
matchers by just typing their respective input string again (without even
supplying if it's a regex or not).
Reviewers: davide, mib
Reviewed By: mib
Subscribers: mgorny, JDevlieghere
Differential Revision: https://reviews.llvm.org/D84151
This avoids massive warning spam due to the unit tests' use of gtest and gmock, which do not use the 'override' keyword in their sources.
Differential Revision: https://reviews.llvm.org/D84213
RecordInterestingDirectory was added to collect dSYM bundles and their
content. For the current working directory we only want the directory to
be part of the VFS, not necessarily its contents. This patch renames the
current method to RecordInterestingDirectoryRecursively and adds a new
one that's not recursive.
Summary:
This patch adds the ability to peel off iterations of the first loop in loop
fusion. This can allow for both loops to have the same trip count, making it
legal for them to be fused together.
Here is a simple scenario peeling can be used in loop fusion:
for (i = 0; i < 10; ++i)
a[i] = a[i] + 3;
for (j = 1; j < 10; ++j)
b[j] = b[j] + 5;
Here is we can make use of peeling, and then fuse the two loops together. We can
peel off the 0th iteration of the loop i, and then combine loop i and j for
i = 1 to 10.
a[0] = a[0] +3;
for (i = 1; i < 10; ++i) {
a[i] = a[i] + 3;
b[i] = b[i] + 5;
}
Currently peeling with loop fusion is only supported for loops with constant
trip counts and a single exit point. Both unguarded and guarded loops are
supported.
Author: sidbav (Sidharth Baveja)
Reviewers: kbarton, Meinersbur, bkramer, Whitney, skatkov, ashlykov, fhahn, bmahjour
Reviewed By: bmahjour
Subscribers: bmahjour, mgorny, hiraditya, zzheng
Tags: LLVM
Differential Revision: https://reviews.llvm.org/D82927
* If two group members are combined, we should leave just one index in the SHT_GROUP content.
* If a group member is discarded (/DISCARD/ or upcoming -r --gc-sections combination),
we should drop its index in the SHT_GROUP content. LLD currently crashes (`getOutputSection()` is null).
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D84129
Summary:
The purpose of this change is to do a small refactoring of code in
ASTImporterTest.cpp by moving it to ASTImporterFixtures.h in order to
support tests of downstream custom types and minimize the "living
downstream burden" of frequent integrations from community to a
downstream repo that implements custom AST import tests.
Reviewers: martong, a.sidorin, shafik
Reviewed By: martong
Subscribers: balazske, dkrupp, bjope, rnkovacs, teemperor, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83970
In an object file, a "PC Begin" field in a FDE is usually relocated by a
PC-relative relocation. Use a relocation-aware DWARFDataExtractor overload (with
DWARFContext and a reference to its internal .eh_frame representation) to decode
addresses correctly. In an object file, most sections have addresses of zero. So
the displayed addresses are almost always offsets relative to the start of the
associated text section.
DWARFContext::create handles .eh_frame and .rela.eh_frame by itself, so if there
are more than one .eh_frame (technically possible, but almost always erronerous
in practice), this will only handle the first one. Supporting multiple
.eh_frame is beyond the scope of this patch.
Reviewed By: grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D84106
Following tests were disabled for clang-11 after upgrading to
version 5.0 in D82963:
1. openmp/runtime/test/env/kmp_set_dispatch_buf.c
2. openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c
They are also failing for clang-12. Thus this temporary disabling
until they are fixed.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D84241
Note: Resubmission with frame pointers force-enabled to fix builds with
-DCOMPILER_RT_BUILD_BUILTINS=False
Summary:
Splits the unwinder into a non-segv (for allocation/deallocation traces) and a
segv unwinder. This ensures that implementations can select an accurate, slower
unwinder in the segv handler (if they choose to use the GWP-ASan provided one).
This is important as fast frame-pointer unwinders (like the sanitizer unwinder)
don't like unwinding through signal handlers.
Reviewers: morehouse, cryptoad
Reviewed By: morehouse, cryptoad
Subscribers: cryptoad, mgorny, eugenis, pcc, #sanitizers
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D83994
The default calling convention needs to save/restore the SVE callee
saves according to the SVE PCS when the function takes or returns
scalable types, even when the `aarch64_sve_vector_pcs` CC is not
specified for the function.
Reviewers: efriedma, paulwalker-arm, david-arm, rengolin
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D84041
SPIR-V lowering does not use `MemrefDescriptor`s when lowering memref
types. This adds rationale for the choice made.
Differential Revision: https://reviews.llvm.org/D84184
This patch introduces conversion pattern for `spv.selection` op.
The conversion can only be applied to selection with all blocks being
reachable. Moreover, selection with control attributes "Flatten" and
"DontFlatten" is not supported.
Since the `PatternRewriter` hook for block merging has not been implemented
for `ConversionPatternRewriter`, merge and continue blocks are kept
separately.
Reviewed By: antiagainst, ftynse
Differential Revision: https://reviews.llvm.org/D83860