Fixed type assertion failure caused by trying to fold a masked load with a
select where the select condition is a scalar value
Reviewed By: sdesmalen, lebedev.ri
Differential Revision: https://reviews.llvm.org/D107372
where should not.
Currently we are using QTy->isIncompleteType(&ND) to check incomplete
type. But before doing that, need to instantiate for a class template
specialization or a class member of a class template specialization,
or an array with known size of such..., so that we know it is really
incomplete type.
To fix this using RequireCompleteType instead.
The new test is added into "test/OpenMP/target_update_messages.cpp"
The different of using RequireCompleteType is when emit incomplete type,
an additional note is also emitted to point to where incomplete type
is declared. Because this change, many tests are needed to be fixed
by adding additional note.
This is to fix https://bugs.llvm.org/show_bug.cgi?id=50508
Differential Revision: https://reviews.llvm.org/D107200
mlir/test/transforms/loop-fusion.mlir is too big and is split into mlir/test/transforms/loop-fusion.mlir, mlir/test/transforms/loop-fusion-2.mlir, mlir/test/transforms/loop-fusion-3.mlir
and mlir/test/transforms/loop-fusion-4.mlir. Further tests can be added in mlir/test/transforms/loop-fusion-4.mlir
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D106473
This change supports to run without parsing MMap binary loading events instead it always assumes binary is loaded at the preferred address. This is used when we have assured no binary load address changes or we have pre-processed the addresses resolution. Warn if there's interior mmap event but without leading mmap events.
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D107097
This changes the lowering of f32 and f64 COPY from a 128bit vector ORR to
a fmov of the appropriate type. At least on some CPU's with 64bit NEON
data paths this is expected to be faster, and shouldn't be slower on any
CPU that treats fmov as a register rename.
Differential Revision: https://reviews.llvm.org/D106365
This is available in GNU ld 2.35 and can be seen as a shortcut for multiple
--export-dynamic-symbol, or a --dynamic-list variant without the symbolic intention.
In the long term, this option probably should be preferred over --dynamic-list.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D107317
Return false from runOnFunction if nothing changed. Curiously
we already returned a bool from detectAndFoldOffset, but didn't
use it.
Fix a couple breaks after returns that I saw while auditing
detectAndFoldOffset.
Differential Revision: https://reviews.llvm.org/D107303
A `Vector` that doesn't require an initial `reserve()` (eg: with a
default, or small enough capacity) can have a constant initializer.
This changes the code in a few places to make that possible:
- mark a few other functions as `constexpr`
- do without any `reinterpret_cast`
- allow to skip `reserve` from `init`
Differential Revision: https://reviews.llvm.org/D107308
https://reviews.llvm.org/D105964 updated the detection of function
definitions. It had the unfortunate effect to start marking object
definitions with attribute-like macros as function definitions.
This addresses this issue.
Reviewed By: owenpan
Differential Revision: https://reviews.llvm.org/D107269
mallopt calls are left-over from the times we used
__libc_malloc/__libc_free for internal allocations.
Now we have own internal allocator, so this is not needed.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107342
We had some similar hasOneUse/isNON_EXTLoad early-outs spread out over different parts of the method - we should pull them all together.
Noticed while triaging PR45116
Fix for https://bugs.llvm.org/show_bug.cgi?id=49723.
Eliminated references from task dependency hash to node allocated on stack,
thus eliminated accesses to stale memory. So the node now never freed.
Uncommented assertion which triggered when stale memory accessed.
Removed unneeded ref count increment for stack allocated node.
Differential Revision: https://reviews.llvm.org/D106705
The inttoptr/ptrtoint roundtrip optimization is not always correct.
We are working towards removing this optimization and adding support to specific cases where this optimization works.
In this patch, we focus on phi-node operands with inttoptr casts.
We know that ptrtoint( inttoptr( ptrtoint x) ) is same as ptrtoint (x).
So, we want to remove this roundtrip cast which goes through phi-node.
Reviewed By: aqjune
Differential Revision: https://reviews.llvm.org/D106289
We currently use ad-hoc spin waiting to synchronize thread creation
and thread start both ways. But spinning tend to degrade ungracefully
under high contention (lots of threads are created at the same time).
Use semaphores for synchronization instead.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107337
This patch implements P2092
Simple requirements in requirement body shall not start with requires.
A warning was already in place so we just turn this warning into an error.
In addition, we add tests to make sure typename is optional in
requirement-parameter-list as per the same paper.
As detailed on https://pvs-studio.com/en/blog/posts/cpp/0771/ and raised on D62583, the SecNo++ increment is not guaranteed to occur before the second use of SecNo in the same addSection() call.
This patch pulls out the increment (just for clarity) and replaces the second use of SecNo with a constant zero value (we're using stable_sort so the value isn't critical).
Differential Revision: https://reviews.llvm.org/D107273
Previously we would show an error, but keep the member, and also the
CXXRrecordDecl, valid. This could lead to crashes when attempting to
access the record layout or size.
Differential Revision: https://reviews.llvm.org/D105478
This patch adds the critical construct to the OpenMP dialect. The
implementation models the definition in 2.17.1 of the OpenMP 5 standard.
A name and hint can be specified. The name is a global entity or has
external linkage, it is modelled as a FlatSymbolRefAttr. Hint is
modelled as an integer enum attribute.
Also lowering to LLVM IR using the OpenMP IRBuilder.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107135
This patch extends the optimization of VID-sequence BUILD_VECTORs
introduced in D104921 to include simple fractional steps composed of a
separated integer numerator and denominator.
A notable limitation in this sequence detection is that only sequences
with steps N/1 or 1/D are found, meaning that the step between elements
and the frequency with which it changes is consistent across the whole
sequence. Fractional steps such as 2/3 won't be matched as those would
involve more complex tracking of state or some level of backtracking.
As is stands, however, this patch is sufficient to match common
interleave-type shuffle indices, for example matching `<0,0,1,1>` (or
commonly `<0,u,1,u>` or `<u,0,u,1>`) to an index sequence divided by 2.
While the optimization is relatively `undef`-tolerant, due to greedy
pattern-matching there even are some simple patterns which confuse the
sequence detection into identifying either a suboptimal sequence or no
sequence at all.
Currently only fractional-step sequences identified as having a
power-of-two denominator are actually lowered to RVV instructions. This
is to avoid introducing divisions into the generated code.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D106533
Add a comment when there is a shifted value,
add x9, x0, #291, lsl #12 ; =1191936
but not when the immediate value is unshifted,
subs x9, x0, #256 ; =256
when the comment adds nothing additional to the reader.
Differential Revision: https://reviews.llvm.org/D107196
Store both interfaceID and objectID as key for interface registration callback.
Otherwise the implementation allows to register only one external model per one object in the single dialect.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D107274
The patch https://reviews.llvm.org/D105964 (58494c856a)
updated detection of function declaration names. It had the unfortunate
consequence that it started breaking between `function` and the function
name in some cases in JavaScript code.
This patch addresses this.
Reviewed By: MyDeveloperDay, owenpan
Differential Revision: https://reviews.llvm.org/D107267
Currently unaligned access functions are defined in tsan_interface.cpp
and do a real call to MemoryAccess. This means we have a real call
and no read/write constant propagation.
Unaligned memory access can be quite hot for some programs
(observed on some compression algorithms with ~90% of unaligned accesses).
Move them to tsan_interface_inl.h to avoid the additional call
and enable constant propagation.
Also reorder the actual store and memory access handling for
__sanitizer_unaligned_store callbacks to enable tail calling
in MemoryAccess.
Depends on D107282.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107283
item of StringTable.
Summary: For the string table in XCOFF, the first 4 bytes
contains the length of the string table, so we should
print the string entries from fifth bytes. This patch
also adds tests for llvm-readobj dumping the string
table.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D105522
Add AccessVptr access type.
For now it's converted to the same thr->is_vptr_access,
but later it will be passed directly to ReportRace
and will enable efficient tail calling in MemoryAccess function
(currently __tsan_vptr_update/__tsan_vptr_read can't use
tail calls in MemoryAccess because of the trailing assignment
to thr->is_vptr_access).
Depends on D107276.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107282
Currently we have MemoryAccess function that accepts
"bool kAccessIsWrite, bool kIsAtomic" and 4 wrappers:
MemoryRead/MemoryWrite/MemoryReadAtomic/MemoryWriteAtomic.
Such scheme with bool flags is not particularly scalable/extendable.
Because of that we did not have Read/Write wrappers for UnalignedMemoryAccess,
and "true, false" or "false, true" at call sites is not very readable.
Moreover, the new tsan runtime will introduce more flags
(e.g. move "freed" and "vptr access" to memory acccess flags).
We can't have 16 wrappers and each flag also takes whole
64-bit register for non-inlined calls.
Introduce AccessType enum that contains bit mask of
read/write, atomic/non-atomic, and later free/non-free,
vptr/non-vptr.
Such scheme is more scalable, more readble, more efficient
(don't consume multiple registers for these flags during calls)
and allows to cover unaligned and range variations of memory
access functions as well.
Also switch from size log to just size.
The new tsan runtime won't have the limitation of supporting
only 1/2/4/8 access sizes, so we don't need the logarithms.
Also add an inline thunk that converts the new interface to the old one.
For inlined calls it should not add any overhead because
all flags/size can be computed as compile time.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107276