Currently the NPVTX work function is marked volatile. This prevents some
optimizations from using this value.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106310
We've been forgetting to add those to most of the <ranges> review.
To avoid forgetting in the future, I added an item in the pre-commit
checklist.
Differential Revision: https://reviews.llvm.org/D106287
Implement pass 3 of bind opcodes from ld64 (which supports both 32-bit and 64-bit).
Pass 3 implementation condenses BIND_OPCODE_DO_BIND_ADD_ADDR_ULEB opcode
to BIND_OPCODE_DO_BIND_ADD_ADDR_IMM_SCALED. This change is already behind an
O2 flag so it shouldn't impact current performance. I verified ld64's output with x86_64 LLD
and they were both emitting the same optimized bind opcodes (although in a slightly different
order). Tested with arm64_32 LLD and compared that with x86 LLD that the order of the bind
opcodes are the same (offset values are different which should be expected).
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D106128
Allow arbitrary strides, and make sure we return the correct result when
the backedge-taken count is zero.
Differential Revision: https://reviews.llvm.org/D106197
Wrap semantics are subtle when combined with multiple exits. This has caused several rounds of confusion during recent reviews, so try to document the subtly distinction between when wrap flags provide <u and <=u facts.
The specific case that triggered this was when inlining a recursive
internal function into itself caused the recursion to go away, allowing
the inliner to mark the function as dead. The inliner marks the SCC as
invalidated but does not provide a new SCC to continue with.
This matches the implementations of ModuleToPostOrderCGSCCPassAdaptor
and CGSCCPassManager.
Fixes PR50363.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D106306
This patch is in a series of patches to provide builtins for
compatibility with the XL compiler. This patch adds software divide
builtins with no checking. These builtins are each emitted as a fast
fdiv.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D106150
The `reifyReturnTypeShapesPerResultDim` method supports shape
inference for rsults that are ranked types. These are used lower in
the codegeneration stack than its counter part `reifyReturnTypeShapes`
which also supports unranked types, and is more suited for use higher
up the compilation stack. To have separation of concerns, this method
is split into its own interface.
See discussion : https://llvm.discourse.group/t/better-layering-for-infershapedtypeopinterface/3823
Differential Revision: https://reviews.llvm.org/D106133
This avoids duplication and simplifies the code in several places
without increasing the size of the symbol union (at least not
above the assert'd limit of 120 bytes).
Differential Revision: https://reviews.llvm.org/D106026
When we go to destroy the process, we first try to halt it, if
we succeeded and the target stopped, we want to clear out the
thread plans and breakpoints in case we still need to resume to complete
killing the process. If the target was exited or detached, it's
pointless but harmless to do this. But if the state is eStateInvalid -
for instance if we tried to interrupt the target to Halt it and that
fails - we don't want to keep trying to interact with the inferior,
so we shouldn't do this work.
This change explicitly checks eStateStopped, and only does the pre-resume
cleanup if we did manage to stop the process.
Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup
kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data
sections). Usually this is done in a convoluted way, with unnamed temp data
symbols which target the start of the function, in which case
WasmObjectWriter::recordRelocation converts it to use the section symbol
instead. However in some cases the function can actually be undefined; in this
case the dwarf generator uses the function symbol (a named undefined function
symbol) instead. In that case the section-symbol transform doesn't work and we
need to generate the correct reloc type a different way. In this change
WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into
account to choose the correct reloc type.
Fixes PR50408
Differential Revision: https://reviews.llvm.org/D103557
It turns out that during training, the time required to parse the
textual protobuf of a training log is about the same as the time it
takes to compile the module generating that log. Using binary protobufs
instead elides that cost almost completely.
Differential Revision: https://reviews.llvm.org/D106157
This is the first step to support software pipeline for scf.for loops.
This is only the transformation to create pipelined kernel and
prologue/epilogue.
The scheduling needs to be given by user as many different algorithm
and heuristic could be applied.
This currently doesn't handle loop arguments, this will be added in a
follow up patch.
Differential Revision: https://reviews.llvm.org/D105868
Break an unwrapped line before the first parameter declaration in a
K&R C function definition.
This fixes PR51074.
Differential Revision: https://reviews.llvm.org/D106112
As discussed on D105251, currently the compiler does not support
multiple metadata attachments on instructions having the same
identifier, whereas it does for global objects. Note this in the
Language Reference manual for clarity.
See D105251 for discussions of history behind this divergence, and the
complexities and possible approaches of adding this support to
instructions in the future.
Differential Revision: https://reviews.llvm.org/D106304
This makes it more explicit what the scope of this pass is. The name
of this pass predates fusion on tensors using tile + fuse, and hence
the confusion.
Differential Revision: https://reviews.llvm.org/D106132
We need the compiler generated variable to override the weak symbol of
the same name inside the profile runtime, but using LinkOnceODRLinkage
results in weak symbol being emitted in which case the symbol selected
by the linker is going to depend on the order of inputs which can be
fragile.
This change replaces the use of weak definition inside the runtime with
a weak alias. We place the compiler generated symbol inside a COMDAT
group so dead definition can be garbage collected by the linker.
We also disable the use of runtime counter relocation on Darwin since
Mach-O doesn't support weak external references, but Darwin already uses
a different continous mode that relies on overmapping so runtime counter
relocation isn't needed there.
Differential Revision: https://reviews.llvm.org/D105176
The patch does not depend on the availability of the library functions for
memcpy/memset as it operates on LLVM intrinsics. The optimizations are useful
on the targets that have these functions disabled (e.g. NVPTX & AMDGPU).
Differential Revision: https://reviews.llvm.org/D104801
The new testing configuration did not turn off #pragma system_header,
which means we were not seeing warnings in system headers.
Differential Revision: https://reviews.llvm.org/D106187
This diff changes llvm-ifs to use unified IFS file format
and perform other renaming changes in preparation for the
merging between elfabi/ifs.
Differential Revision: https://reviews.llvm.org/D99810
This change implements unified text stub format and command line
interface proposed in the elfabi/ifs merge plan.
Differential Revision: https://reviews.llvm.org/D99399
Added shape inference handles cases for convolution operations. This includes
conv2d, conv3d, depthwise_conv2d, and transpose_conv2d. With transpose conv
we use the specified output shape when possible however will shape propagate
if the output shape attribute has dynamic values.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105645